Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2219 

A DNA sequence (GBSx2338) was identified in S.agalactiae <SEQ ID 6847> which encodes the amino 
acid sequence <SEQ ID 6848>. Analysis of this protein sequence reveals the following:

Possible site: 33 

>>> Seems to have an uncleavable N-term signal seq

Final Results 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAF96286 GB:AE004374 hypothetical protein [Vibrio cholerae]

Identities = 56/167 (33%) , Positives = 89/167 (52%) , Gaps = 12/167 (7%) 

Query: 18 LAIIKSLPLNDCWLCAGTLRNFVWNKLS-GINETLTSDIDWFFDKNI SYEETWLE 73 

L + L L C++ AG +RN VW+ L + T +DIDV+FFD + YE++ LE 

Sbjct: 41 LECVYQLELPQCYIAAGFWRHLVWDSIiHHNVKLTPIiraJIDVIFFnftDCLDSDYEKS-- 98

Query: 74 QQLKI«NYPQYDVffiLKNEFYM^m^SPNTPKyrSSKDAISKFPEKCTAVGARLDD!a^QLEL^ 133 

+L + PQ +W++KN+ M+ + + P Y S+ DA+S +PEK TAV R + ++ E 
Sbjct: 99 LKLSEQMPQIJSIWQVKNQAKMHLQNGDNP-YQSTLDaMSYWPEKETAVAVRKVEHDRYECI 157 

Query: 134 LPYGEEEILNFIVSPTPYFEEDLLRYNVYLKRVDKKKWNNIWPRLTI 180 

+G E + ++ P Y ++ RV K W +WP L I 

Sbjct: 158 SAFGFESLFQGFITHNP KRAYGIFENRVKSKGWLAMWPNLRI 199 
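
Each alignment in this section is headed by a summary line giving identities, positives and, where present, gaps as counts over the alignment length. The following is a minimal sketch, not part of the original analysis, of how such a line could be read programmatically; the function names `parse_alignment_summary` and `fraction` are hypothetical. Because the printed percentages are whole numbers, the raw counts are the more precise figures.

```python
import re
from typing import Dict, Tuple

# Matches fields such as "Identities = 56/167 (33%)". The pattern tolerates
# the stray spaces before commas that appear in this listing.
_FIELD = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)\s*/\s*(\d+)")


def parse_alignment_summary(line: str) -> Dict[str, Tuple[int, int]]:
    """Return {field: (count, alignment_length)} for one summary line."""
    return {name: (int(n), int(d)) for name, n, d in _FIELD.findall(line)}


def fraction(stats: Dict[str, Tuple[int, int]], field: str) -> float:
    """Return count / alignment_length for the named field."""
    count, length = stats[field]
    return count / length


summary = "Identities = 56/167 (33%) , Positives = 89/167 (52%) , Gaps = 12/167 (7%)"
stats = parse_alignment_summary(summary)
print(stats["Identities"])                             # (56, 167)
print(round(100 * fraction(stats, "Identities"), 1))   # 33.5
```

Summary lines without a Gaps field simply yield a dictionary with no "Gaps" key.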

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2220 

A DNA sequence (GBSx2339) was identified in S.agalactiae <SEQ ID 6849> which encodes the amino 
acid sequence <SEQ ID 6850>. Analysis of this protein sequence reveals the following:

Possible site: 17 

>>> Seems to have no N-terminal signal sequence

Final Results 

bacterial cytoplasm --- Certainty=0.2779 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
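
The localisation prediction above is reported as three "Final Results" rows, each pairing a compartment with a certainty value and a verdict. As an illustrative sketch only, the rows could be parsed and the highest-certainty compartment selected as shown below. It assumes rows of the form `bacterial <site> --- Certainty=<value> (<verdict>)`; the names `parse_final_results` and `predicted_location` are hypothetical and do not belong to the prediction program itself.

```python
import re
from typing import List, Tuple

# One "Final Results" row, for example:
#   bacterial cytoplasm --- Certainty=0.2779 (Affirmative) < succ>
# The dashes are optional because some rows in the listing lack them.
_ROW = re.compile(
    r"^\s*bacterial\s+(\w+)\s*(?:-+\s*)?Certainty\s*=\s*([0-9.]+)\s*\(([^)]+)\)"
)


def parse_final_results(text: str) -> List[Tuple[str, float, str]]:
    """Return (site, certainty, verdict) for every row found in text."""
    rows = []
    for line in text.splitlines():
        m = _ROW.match(line)
        if m:
            rows.append((m.group(1), float(m.group(2)), m.group(3)))
    return rows


def predicted_location(rows: List[Tuple[str, float, str]]) -> str:
    """Return the site with the highest certainty (first one wins on ties)."""
    return max(rows, key=lambda r: r[1])[0]


example = """\
bacterial cytoplasm --- Certainty=0.2779 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
"""
rows = parse_final_results(example)
print(predicted_location(rows))  # cytoplasm
```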

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB13060 GB:Z99110 yjdF [Bacillus subtilis]

Identities = 47/138 (34%) , Positives = 93/138 (67%) , Gaps = 2/138 (1%) 

Query: 1 MKMTVYFDGNFWLGLIEYDDDGDYKVFRYFFGKEPKDDDVENFINHKLNDLIKKYEFV^ 60 
MK+T+Y+DG FW+G++E D+G + FR+ FGKEP+D +V F++++L +++ + E + 
Sbjct: 24 MKLTIYYDGQFWVGVVEVVDNGKLRAFRHLFGKEPRDSEVLEFVHNQLENMMAQAE--QE 81

Query: 61 DISLKRTNEHKKSPKRMQREINREKEKPVVSTKJiQLAMKTIHMSIKNERQLSQKCKiCNEL 120 

+ L+ + K +PKR+QR++++E + V++KAQ A+K + K +++ K ++ + 
Sbjct: 82 GVRLQGRRQKKINPKRLQRQVSKELKNAGVTSKAQEAIKLELEARKQKKKQIMKEQREHV 141 

Query: 121 RKHRYQLKQEKRYQKKKG 138

++ RY LK++K +K +G 
Sbjct: 142 KEQRYMLKKQKAKKKHRG 159 
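
The identity count reported for an alignment is the number of columns in which the query and subject residues are the same. A minimal sketch of that count, applied to the final block of the alignment above, is given below; `count_identities` is a hypothetical helper, and the "positives" figure is not reproduced because it depends on a substitution matrix that is not stated here.

```python
def count_identities(query: str, sbjct: str) -> int:
    """Count aligned columns where both sequences carry the same residue.

    Both strings must be the aligned forms (same length, '-' for gaps).
    """
    if len(query) != len(sbjct):
        raise ValueError("aligned sequences must have equal length")
    return sum(q == s and q != "-" for q, s in zip(query, sbjct))


query = "RKHRYQLKQEKRYQKKKG"  # Query residues 121-138
sbjct = "KEQRYMLKKQKAKKKHRG"  # Sbjct residues 142-159
print(count_identities(query, sbjct))  # 7
```

Summing this count over all blocks of an alignment gives the numerator of its "Identities" figure.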

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2221 

A DNA sequence (GBSx2340) was identified in S.agalactiae <SEQ ID 6851> which encodes the amino
acid sequence <SEQ ID 6852>. This protein is predicted to be ComX1. Analysis of this protein sequence
reveals the following:

Possible site: 52 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3143 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9469> which encodes amino acid sequence <SEQ ID 9470>
was also identified.

The protein has homology with the following sequences in the GENPEPT database. 



>GP:AAD50429 GB:AF151701 ComX2 [Streptococcus pneumoniae] 
Identities = 61/152 (40%) , Positives = 95/152 (62%) 

Query: 5 EELFDKVKPIVMKLRRNYFVQLWEYDDWIQEGRIVLFRLLEEHPYLLDNESKLFIYFKTK 64

+EL+++V+ V K R Y++ LWE DW QEG + L I1+ L+D+ +L YFKTK 

Sbjct: 3 KELYEEVQGTVYKCRNEYYLHLWELSDWDQEGMLCLHELISREEGLVDDIPRLRKYFKTK 62 

Query: 65 FSNYLNDVLRHQDCQKRQFNKMPYEEISEVSHYVKSKGLVLDDYIAYRDTLTKVEETLSD 124

F N + D +R Q+ QKR+++K PYEE+ E+SH + GL LDDY + +TL S 
Sbjct: 63 FRNRILDYIRKQESQKRRYDKEPYEEVGEISHRISEGGLWLDDYYLFHETLRDYRNKQSK 122 

Query: 125 IDKEKFEKLISGERFAGKKQFIRDIQPFFNAF 156
+E+ E+++S ERF G+++ +RD++ F F

Sbjct: 123 EKQEELERVLSNERFRGRQRVLRDLRIVFKEF 154 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6853> which encodes the amino acid 
sequence <SEQ ID 6854>. Analysis of this protein sequence reveals the following: 

Possible site: 39

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood =-10.35 Transmembrane 9 - 25 ( 7 - 28) 

Final Results

bacterial membrane --- Certainty=0.5140 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>
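
Where a membrane-spanning segment is predicted, the listing carries an INTEGRAL line giving a likelihood score, the residue range of the segment and a second range in parentheses. The sketch below shows one way such a line could be read. The parenthesised range is taken here to be an extended range around the core segment, which is an assumption since the listing does not define it; `IntegralSegment` and `parse_integral` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class IntegralSegment(NamedTuple):
    likelihood: float
    start: int        # first residue of the predicted segment
    end: int          # last residue of the predicted segment
    outer_start: int  # parenthesised range (assumed to be an extended range)
    outer_end: int


_INTEGRAL = re.compile(
    r"INTEGRAL\s+Likelihood\s*=\s*(-?\d+(?:\.\d+)?)\s+Transmembrane\s+"
    r"(\d+)\s*-\s*(\d+)\s*\(\s*(\d+)\s*-\s*(\d+)\s*\)"
)


def parse_integral(line: str) -> Optional[IntegralSegment]:
    """Parse one INTEGRAL line; return None if the line does not match."""
    m = _INTEGRAL.search(line)
    if m is None:
        return None
    return IntegralSegment(float(m.group(1)), *(int(g) for g in m.groups()[1:]))


seg = parse_integral("INTEGRAL Likelihood =-10.35 Transmembrane 9 - 25 ( 7 - 28)")
print(seg.likelihood, seg.start, seg.end)  # -10.35 9 25
print(seg.end - seg.start + 1)             # 17
```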

A related sequence was also identified in GAS <SEQ ID 9163> which encodes the amino acid sequence
<SEQ ID 9164>. Analysis of this protein sequence reveals the following:

Possible site: 29
>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood =-10.35 Transmembrane 2 - 18 ( 1 - 18)

Final Results

bacterial membrane --- Certainty=0.5140 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>



The protein has homology with the following sequences in the databases: 

>GP:AAD50429 GB:AF161701 ComX2 [Streptococcus pneumoniae] 
Identities = 60/149 (40%) , Positives = 98/149 (65%) 

Query: 41 FEKVKPIILKLKRHYYIQLWDRDDWLQEGHIILLQLLERYPELIEEEERLYRYFKTKFSS 100

+E+V+ + K + YY+ LW+ DW QEG + L +L+ R L+++ RL +YFKTKF + 
Sbjct: 6 YEEVQGTVYKCRNEYYLHLWELSDWDQEGMLCLHELISREEGLVDDIPRLRKYFKTKFRN 65

Query: 101 YLKDLLRRQESQKRQFHKLAYEEIGEVAHAIPSRGLWLDDYVAYQEVIASLENQLNSQER 160 

+ D +R+QESQKR++ K YEE+GE++H I GLWLDDY + E + N+ + +++ 
Sbjct: 66 RILDYIRKQESQKRRYDKEPYEEVGEISHRISEGGLWLDDYYLFHETLRDYRNKQSKEKQ 125 

Query: 161 MQFQALIRGERFKGRRALLRKISPYFKEF 189

+ + ++ ERF+GR+ +LR + FKEF
Sbjct: 126 EELERVLSNERFRGRQRVLRDLRIVFKEF 154 

An alignment of the GAS and GBS proteins is shown below. 

Identities = 78/149 (52%) , Positives = 116/149 (77%) 

Query: 8 FDKVKPIVMKLRRNYFVQLWEYDDWIQEGRIVLFRLLEEHPYLLDNESKLFIYFKTKFSN 67 

F+KVKPI++KL+R+Y++QLW+ DDW+QEG I+L +LLE +P L++ E +L+ YFKTKFS+ 
Sbjct: 41 FEKVKPIILKLKRHYYIQLWDRDDWLQEGHIILLQLLERYPELIEEEERLYRYFKTKFSS 100 

Query: 68 YLNDVLRHQDCQKRQFNKMPYEEISEVSHYVKSKGLVLDDYIAYRDTLTKVEETLSDIDK 127

YL D+LR Q+ QKRQF+K+ YEEI EV+H + S+GL LDDY+AY++ + +E L+ ++ 
Sbjct: 101 YLKDLLRRQESQKRQFHKLAYEEIGEVAHAIPSRGLWLDDYVAYQEVIASLENQLNSQER 160

Query: 128 EKFEKLISGERFAGKKQFIRDIQPFFNAF 156 

+F+ LI GERF G++ +R I P+F F 
Sbjct: 161 MQFQALIRGERFKGRRALLRKISPYFKEF 189 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2222 

A DNA sequence (GBSx2341) was identified in S.agalactiae <SEQ ID 6855> which encodes the amino 
acid sequence <SEQ ID 6856>. Analysis of this protein sequence reveals the following: 

Possible site: 57 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -2.23 Transmembrane 166 - 182 ( 166 - 182) 



Final Results

bacterial membrane --- Certainty=0.1893 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>


The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAA99510 GB:Z75191 ORF YOR283W [Saccharomyces cerevisiae]
Identities = 57/226 (25%) , Positives = 97/226 (42%) , Gaps = 22/226 (9%) 



Query: 4 VRLYIARHGKTMFNTIGRAQGWSDTPLTTFGELGIKELGLGLKASNISFKEAFSSDSGRT 63

+RL+I RHG+T N QG DT + GE +LG L++ I F + SSD R 

Sbjct: 17 IRLFIIRHGQTEHNVKKILQGHKDTSINPTGEEQATKLGHYLRSRGIHFDKVVSSDLKRC 76

Query: 64 LQTMEIILREVQQENIPYTRDKRIREWCFGSLDGGYDGDLFNGVLPRVSNGDMSHLTHEE 123

QT +QEN+P + +RE G ++G M E+ 

Sbjct: 77 RQTTALVLKHSKQENVPTSYTSGLRERYMGVIEG MQITEAEK 118 

Query: 124 IMILICQVDTAGWAEPWAILSITOILSGFTAIAKKIEDIGGGISIAIWSHGOTIATFL-W 181

A++ +E R+ ++GN +VSHG I L WL 

Sbjct: 119 YADKHGEGSFRNFGEKSDDFVARLTGCOTEEVAEASNEGVKNLALVSHGGAIRMILQWLK 178 

Query: 182 IDHSTPRSLGLDNGSVSVVDF--EDGTFSIQSIGDMSYREKGREIL 225
++ + + N SV++VD+ + F ++ +G+ + G ++

Sbjct: 179 YENHQAHKIIVFNTSVTIVDYVKDSKQFIVRRVGNTQHLGDGEFVV 224

A related DNA sequence was identified in S.pyogenes <SEQ ID 6857> which encodes the amino acid 
sequence <SEQ ID 6858>. Analysis of this protein sequence reveals the following: 

Possible site: 57

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -0.69 Transmembrane 170 - 186 ( 170 - 186) 

Final Results

bacterial membrane --- Certainty=0.1277 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the databases:

>GP:CAA99510 GB:Z75191 ORF YOR283W [Saccharomyces cerevisiae]
Identities = 64/231 (27%) , Positives = 98/231 (41%) , Gaps = 27/231 (11%) 

Query: 5 RLYIARHGKTMFNTIGRAQGWSDTPLTKKGEEGIRELGLGLKDATIPFKAAFSSDSGRTM 64
RL+I RHG+T N QG DT + GEE +LG L+ IF SSD R

Sbjct: 18 RLFIIRHGQTEHNVKKILQGHKDTSINPTGEEQATKLGHYLRSRGIHFDKVVSSDLKRCR 77 

Query: 65 QTIEIILRESENEFLPYTKDNRIREWCFGSLEGTYDSELFLGVLPRTKAFENRDNLRDVP 124 
QT ++L+ S+ E +P + + +RE G +EG +E 
Sbjct: 78 QTTALVLKHSKQENVPTSYTSGLRERYMGVIEGMQITEA 116

Query: 125 YSELAESIVEVDTANWAEPWEVLRKRIWEGFEAIALSIQNAGGGNALWSHGMTIGTFL- 183 

+ A+ E N+ E + R+ E NGN +VSHG I L 

Sbjct: 117 -EKYADKHGEGSFRNFGEKSDDFVaRLTGCVEEEVAEASNEGVKNLALVSHGGAIRMILQ 175 

Query: 184 WL--IDPDRDKQYIDNGSVTVVEF--DDGQFTIKTIGDMSYRYRGREIIEE 230

WL + K + N SVT+V++ D QF ++ +G+ + G ++ +
Sbjct: 176 WLKYENHQAHKIIVFNTSVTIVDYVKDSKQFIVRRVGNTGHLGDGEFVVSD 226

An alignment of the GAS and GBS proteins is shown below.

Identities = 150/231 (64%) , Positives = 182/231 (77%) , Gaps = 5/231 (2%) 

Query: 1 MSKVRLYIARHGKTMFNTIGRAQGWSDTPLTTFGELGIKELGLGLKASNISFKEAFSSDS 60 
M+K RLYIARHGKTMFNTIGRAQGWSDTPLT GE GI+ELGLGLK + I FK AFSSDS 
Sbjct: 1 MTKTRLYIARHGKTMFNTIGRAQGWSDTPLTKKGEEGIRELGLGLKDATIPFKAAFSSDS 60

Query: 61 GRTLQTMEIILREVQQENIPYTRDKRIREWCFGSLDGGYDGDLFNGVLPRV----SNGDM 116

GRT+QT+EIILRE + E +PYT+D RIREWCFGSL+G YD +LF GVLPR + ++ 

Sbjct: 61 GRTMQTIEIILRESENEFLPYTKDNRIREWCFGSLEGTYDSELFLGVLPRTKAFENRDNL 120 



Query: 117 SHLTHEEIANLICQVDTAGWAEPWAILSNRILSGFTAIAKKIEDIGGGNAIVVSHGMTIA 176

+ + E+A I +VDTA WAEPW +L RI GF AIA I++ GGGNA+VVSHGMTI
Sbjct: 121 RDVPYSELAESIVEVDTANWAEPWEVLRKRIWEGFEAIALSIQNAGGGNALVVSHGMTIG 180

Query: 177 TFLWLIDHSTPRSLGLDNGSVSVVDFEDGTFSIQSIGDMSYREKGREILEK 227

TFLWLID + +DNGSV+VV+F+DG F+I++IGDMSYR +GREI+E+
Sbjct: 181 TFLWLIDPDRDKQY-IDNGSVTVVEFDDGQFTIKTIGDMSYRYRGREIIEE 230

A related GBS gene <SEQ ID 8999> and protein <SEQ ID 9000> were also identified. Analysis of this
protein sequence reveals the following:

Cytoplasmic predicted but experimentally found on the surface of Streptococci 
32.3/52.0% over 184aa 

Thermotoga maritima
EGAD|165681| phosphoglycerate mutase Insert characterized

GP|4981935|gb|AAD36444.1|AE001791_6|AE001791 phosphoglycerate mutase Insert characterized
PIR|G72260|G72260 phosphoglycerate mutase - (strain MSB8) Insert characterized

ORF01265(268 - 870 of 1248)

EGAD|165681|tm1374 (1 - 185 of 201) phosphoglycerate mutase {Thermotoga maritima}
GP|4981935|gb|AAD36444.1|AE001791_6|AE001791 phosphoglycerate mutase {Thermotoga maritima}
PIR|G72260|G72260 phosphoglycerate mutase - Thermotoga maritima (strain MSB8)
%Match = 6.3

%Identity = 32.2 %Similarity = 52.0

Matches = 57 Mismatches = 78 Conservative Sub.s = 35

105 135 165 195 225 255 285 315 

R6RiraSYEIENPFSMLLKRIlTOFYFCSR*LQNFFIGKVR*YIPVKAFVFCYNIIKCL*GVSMSKTO 

tlhl-l 
MKLYLIRHGETIWNEK 
10 

345 375 405 435 465 495 519 549 

GRAQGWSDTPLTTFGELG1KELGLGLKASNISFKEAFSSDSXRTLQTMEIILREVQQEMI--PYTRDKRIREWCFGSLDG 



GLWQGVTDVPtNERGREQRRKLaNSLK RVDAIYSSPLKRSLETAEEIARRFEKEI IVEEDLRECEISLW - - 

30 40 50 60 70 80 

579 609 639 669 699 729 759 

GYDGDIiFNGVLPRVSNGDMSHLTHEEIANLICQVDTAGWA EPVIAILSNRILSGFTAIAKKIEDIGGGNAI 



NGLTVEE-AIREYPVEFKKWSSDPNFGMEGLESMRNVQHRVVKAIMKIVSQEKLNGSENVV 

90 100 110 120 130 140 

789 816 840 870 900 930 960 990 

WSHGMTIATFL-WLIDHST--PRSLGIJDNGSVSVVDFEDGTFSIQSIGDMSYREKBREILEKrLQ*KKIKLSDSV*LVF 
:||| ::: \: |:: \ : : \\\ |:|||: | : :| 

IVSHSLSLRAFICWILGLPLYLHRNFKLDNASLSVVEIESKPRLVIiiJSIDTCHLKES 
160 170 180 190 200 

SEQ ID 9000 (GBS44) was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 4 (lane 6; MW 27kDa), in Figure 168 (lane 8-10; MW 42kDa - thioredoxin
fusion) and in Figure 238 (lane 7; MW 42kDa). It was also expressed in E.coli as a GST-fusion product.
SDS-PAGE analysis of total cell extract is shown in Figure 12 (lane 8; MW 52.4kDa).

Purified Thio-GBS44-His is shown in Figure 244, lanes 7 & 8. 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2223 

A DNA sequence (GBSx2342) was identified in S.agalactiae <SEQ ID 6859> which encodes the amino 
acid sequence <SEQ ID 6860>. This protein is predicted to be d-alanyl-d-alanine carboxypeptidase. 
Analysis of this protein sequence reveals the following: 

Possible site: 27 

>>> Seems to have a cleavable N-term signal seq.



Final Results 
bacterial outside --- Certainty=0.3000 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:AAD00280 GB:U78599 putative D,D-carboxypeptidase [Streptococcus mutans]
Identities = 108/169 (63%) , Positives = 139/169 (81%) 

Query: 79 ELSPDVVPVENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNSYVTQEM 138
E++PDV ++ + +D RI + +FL AA+ IDS EHLISGYRSVAYQE+L+N+Y+ QE

Sbjct: 4 EMNPDVTDIDGVKVDSRIAENTRKFLAAAQEIDSSEHLISGYRSVAYQEELYNNYIAQEK 63

Query: 139 TSNPNLTRGQAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKKIAPQYG 198
+NP+L++ +A+K V+TYSQP G+SEHQTGLA+DMSTVDSLN+SD VV+++ IAP+YG
Sbjct: 64 ANNPSLSQEEAQKQVQTYSQPPGSSEHQTGLAIDMSTVDSLNQSDANVVAKVRAIAPKYG 123

Query: 199 FVLRFPDGKTAETGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKE 247

FVLRFP+GK TG+ YEDWHYRYVGV+SAKYM KH LTLEEY+ LKE 
Sbjct: 124 FVISFPEGKKnaTGIDYEDWHYRYVGVKSAKYlWKHDLTLEEYIiKKLKE 172 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6861> which encodes the amino acid
sequence <SEQ ID 6862>. Analysis of this protein sequence reveals the following:

Possible site: 26
>>> Seems to have an uncleavable N-term signal seq
INTEGRAL Likelihood = -9.66 Transmembrane 10 - 26 ( 3 - 29)

Final Results 

bacterial membrane --- Certainty=0.4864 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the databases: 

>GP:AAD00280 GB:U78599 putative D, D-carboxypeptidase [Streptococcus mutans] 

Identities = 118/173 (68%) , Positives = 139/173 (80%) 

Query: 74 ITKEMSPEIADINGISVDKRIEQATSDFLAAAQAIDLQEHLISGYRSVDYQTELYQSYIK 133 

IT EM+P++ DI+G+ VD RI + T FLAAAQ ID EHLISGYRSV YQ ELY +YI
Sbjct: 1 ITAEMNPDVTDIDGVKVDSRIAENTRKFLAAAQEIDSSEHLISGYRSVAYQEELYNNYIA 60

Query: 134 KEMANDPTLTQEAAEALVQTYSQPPGASEHHTGLAIDMSTVDTLNASDPSVAKAVQKIAP 193

+E AN+P+L+QE A+ VQTYSQPPG+SEH TGLAIDMSTVD+LN SD +V V IAP
Sbjct: 61 QEKANNPSLSQEEAQKQVQTYSQPPGSSEHQTGLAIDMSTVDSLNQSDANVVAKVRAIAP 120

Query: 194 DYGFVLRFPEGKKTSTGVDYEDWHYRYVGKASARYMAQHNLTLEEYIAALKEK 246 
YGFVLRFPEGKK +TG+DYEDWHYRYVG SA+YM +H+LTLEEY+ LKEK

Sbjct: 121 KYGFVIJlFPEGKKiaTGIDYEDWHYRYVGVKSAKYOTKHDLTLEEYLKKLKEK 173 

An alignment of the GAS and GBS proteins is shown below. 

Identities = 131/235 (55%) , Positives = 172/235 (72%) , Gaps = 3/235 (1%) 

Query: 15 LLAILCF--SLFALLKPNSQQSSSQKLRNEDIKKISSQKRNKKLQLPAVSSKDWNLILVN 72

LL ++ F L+ +KP + +Q L ++I++ +K ++ LP VS +DW L+LVN 
Sbjct: 12 lilVIVFI/SGLYLFIKPEESOTPTQ-USIKKEIQQKDIKItTDRLRALPKVSVEDWELVLVN 70 

Query: 73 RDHKHEELSPDVVPVENIYLDKRITKQATQFLEAARAIDSREHLISGYRSVAYQEKLFNS 132

RDH +E+SP++ + I +DKRI + + FL AA+AID +EHLISGYRSV YQ +L+ S
Sbjct: 71 RDHITKEMSPEIADINGISVDKRIEQATSDFLAAAQAIDLQEHLISGYRSVDYQTELYQS 130

Query: 133 YVTQEMTSNPNLTRGQAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRVVSQLKK 192
Y+ +EM ++P LT+ AE LV+TYSQP GASEH TGLA+DMSTVD+LN SDP V ++K

Sbjct: 131 YIKKEMANDPTLTQEAAEALVQTYSQPPGASEHHTGLAIDMSTVDTLNASDPSVAKAVQK 190



Query: 193 IAPQYGFVLRFPDGKTAETGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKE 247
IAP YGFVLRFP+GK TGV YEDWHYRYVG SA+YMA+H+LTLEEYI LKE
Sbjct: 191 IAPDYGFVLRFPEGKKTSTGVDYEDWHYRYVGKASARYMAQHNLTLEEYIAALKE 245

A related GBS gene <SEQ ID 9001> and protein <SEQ ID 9002> were also identified. Analysis of this 
protein sequence reveals the following: 

Lipop: Possible site: -1 Crend: 7 
McG: Discrim Score: 14.03 
GvH: Signal Score (-7.5): -1.02 

Possible site: 27 
>>> Seems to have a cleavable N-term signal seq.
ALOM program count: 0 value: 10.08 threshold: 0.0 
PERIPHERAL Likelihood = 10.08 56 
modified ALOM score: -2.52 

*** Reasoning Step: 3 

Final Results 

bacterial outside --- Certainty=0.3000 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the databases: 

33.7/55.1% over 183aa 

Enterococcus faecalis

EGAD|41322| d-alanyl-d-alanine carboxypeptidase Insert characterized
GP|1209528|gb|AAB05624.1||U35369 D,D-carboxypeptidase Insert characterized

ORF01266(484 - 1038 of 1350)

EGAD|41322|43646 (85 - 268 of 268) d-alanyl-d-alanine carboxypeptidase {Enterococcus
faecalis} SP|Q47746|VANY_ENTFA D-ALANYL-D-ALANINE CARBOXYPEPTIDASE (EC 3.4.16.4)
(DD-PEPTIDASE) (DD-CARBOXYPEPTIDASE). GP|1209528|gb|AAB05624.1||U35369
D,D-carboxypeptidase {Enterococcus faecalis}
%Match = 10.1

%Identity = 33.7 %Similarity = 55.1

Matches = 63 Mismatches = 79 Conservative Sub.s = 40 

234 264 294 324 354 384 414 444 

SR*F*RWNIFYSIYWGYVLSRKRKRNFRKNIAMKKNKIIRFSLVGVLLAILCFSLFALLKENSQQSSSQKLRNEDIKKIS 

MEKSNYHSNVNHHKRHMKQSGEKRAFLWAFIISFTVCTLFIK3WRLVSVLEATQLPPIPATHTGSGTG\ffiEN 
10 20 30 40 50 60 70 

474 504 531 561 588 618 648 678 

SQKRNKKLQLPAVSSKDVmilLVNRDHK-HEELSPDVVPVEN-IYLDKRITKQATQFLEAftRAIDSREHLISGYRSVAYQ 
::1:|||||| : : :: : | :| ||: :::|||1 : ||||: | 

PEENTLATAKEQGDEQEWSLIL^^roQNPIPAQYDVELEQLSN6ERIDIRISPYLQDLFDAARADGVYPIYAS6YRTTEKQ 
90 100 110 120 130 140 150 

708 738 768 798 828 858 888 918 

EKLETJSYVTQEMTSNPI^TRGQAEKLVKTYSQPAGASEHQTGLAMDMSTVDSLNESDPRWSQLKKIAPQYGPVLRFPDG 
::: : I I : | ||: :|: | |||| ||1:|:: | :: : | | : : ::||: |:| 

QEIMDEKV-AEYKAK-SYTSAQAICAEAETWVAVPGTSEHQLGLAVDINA-DGIHSTtSNEVYRWI^ 

160 170 180 190 200 210 220 

948 978 1008 1038 1068 1098 1128 1158 

KTAETGVGYEDWHYRYVGVESAKYMAKHHLTLEEYITLLKENNQ*GNVFPC*ILLLLLLFSFSLFFFRF*TIREK*MLIL 

II III I 1111111=1=1 = : I 111!: I 
KTEITGVSNEPWHYRYVGIEAATKIYHQGLCLEEYIJSrrEK 



SEQ ID 6860 (GBS18) was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 4 (lane 3; MW 31kDa).

The GBS18-His fusion product was purified (Figure 93A; see also Figure 189, lane 11) and used to
immunise mice (lane 2 product; 2.0µg/mouse). The resulting antiserum was used for Western blot (Figure
93B), FACS (Figure 93C), and in the in vivo passive protection assay (Table III). These tests confirm that
the protein is immunoaccessible on GBS bacteria and that it is an effective protective immunogen. 

Example 2224 

A DNA sequence (GBSx2343) was identified in S.agalactiae <SEQ ID 6863> which encodes the amino 
acid sequence <SEQ ID 6864>. This protein is predicted to be unnamed protein product. Analysis of this
protein sequence reveals the following: 

Possible site: 34 

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood =-12.58 Transmembrane 10 - 26 ( 3 - 29) 



Final Results 

bacterial membrane --- Certainty=0.6031 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

A related DNA sequence was identified in S.pyogenes <SEQ ID 6865> which encodes the amino acid 
sequence <SEQ ID 6866>. Analysis of this protein sequence reveals the following: 

Possible site: 33 
>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood =-11.83 Transmembrane 10 - 26 ( 4 - 33) 

Final Results 

bacterial membrane --- Certainty=0.5734 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the databases:

>GP:AAD00279 GB:U78599 putative N-acetyl-muramidase [Streptococcus mutans] 
Identities = 66/150 (44%) , Positives = 97/150 (64%) , Gaps = 5/150 (3%) 

Query: 18 LLLIVCPLLSSQRIASADKEVRVNYSQKQFITKMGKEVKPLAKYYGIRPSILIAQILLET 77 

LL+I+ P+L+S +A A+K++ YS K+F+ ++ + L+K YG+R SI+I Q L++
Sbjct: 3 LLVILLPILASGGLAnaNKKMPSPYSHKEFVKEIAPTAQKLSKIYGVRSSIIIGQAALDS 62 

Query: 78 HDGKTLLASKYHNLFSKKATPGQVAITLKSPKQTN---QNV--RYAIYKDDASAIRDYLR 132

H G TLLASKYHNLFS +A+PGQ A+ LKS + N Q V RY +Y+ ++ DY+
Sbjct: 63 HPGSTLLASKYHNLFSIEASPGQGAVRLKSHEYKNGRWQEVTNRYLVYESWKESLYDYMA 122 

Query: 133 MLRQGKEVDKRLYRNLATEKGYKAPAKSLQ 162 

+L K DK LY + T GYK A++LQ 
Sbjct: 123 ILHGNKIWDKALYTTMMTSSGYKTVAHALQ 152 

An alignment of the GAS and GBS proteins is shown below. 

Identities = 67/190 (35%) , Positives = 102/190 (53%) , Gaps = 1/190 (0%) 

Query: 1 MRKRFSLLNFIWTFIFFFFILFPLtNHKGKVDANSRQSVTYTKEEFIQKIVPDAQDLGK 60 

MRKR F+ + F 1+ PLL+ + A+ V Y++++FI K+ + + L K 

Sbjct: 1 MRKRLKFPYFLTLLACPLLLIVCPLLSSQRIASADKEVRVNYSQKQFITKMGKEVKPLAK 60 

Query: 61 SYGIRPSFIIAQAALDSDFGEKILANKYHNLFGLLAEPGTPSITLNDSSTGKKQEKQFTH 120

YGIRPS +IAQ L++ G+ +LA+KYHNLF A PG +ITL S Q ++ 

Sbjct: 61 YYGIRPSILIAQILLETHDGKTLLASKYHNLFSKKATPGQVAITLK-SPKQTNQNVRYAI 119 

Query: 121 YKSWKYSMYDYLAHIKSGATGKJOJSYTIWSVKNPKTLVQKLQDSGFDNDKKYAKKMTEI 180 
YK ++ DYL ++G KY+ + KK +LQ DK YA+++ ++ 
Sbjct: 120 YKDDASAlimmMLRQGKENTDKEai'TOI^^ 179

Query: 181 IDLYDLTRYD 190

I+ DLT YD
Sbjct: 180 IESNDLTNYD 189

SEQ ID 6864 (GBS246) was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell 
extract is shown in Figure 61 (lane 7; MW 24.6kDa). 

GBS246d was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell extract is 
shown in Figure 154 (lanes 14 & 15; MW 21kDa) and in Figure 183 (lane 4; MW 21kDa). It was also 
expressed in E.coli as a GST-fusion product. SDS-PAGE analysis of total cell extract is shown in Figure
187 (lane 12; MW 46kDa). Purified GBS246d-GST is shown in Figure 243, lane 12.

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2225 

A DNA sequence (GBSx2344) was identified in S.agalactiae <SEQ ID 6867> which encodes the amino 
acid sequence <SEQ ID 6868>. Analysis of this protein sequence reveals the following:

Possible site: 44 

>>> Seems to have no N-terminal signal sequence

Final Results 

bacterial cytoplasm --- Certainty=0.2541 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:AAC45610 GB:U78296 repressor of class 1 heat shock gene expression HrcA [Streptococcus mutans]
Identities = 227/345 (65%) , Positives = 287/345 (82%) , Gaps = 1/345 (0%) 

Query: 17 VITQRQNDILNLIVELFTQTHEPVGSKALQRTIDSSSATIRNDMAKLEKLGLLEKAHTSS 76

+ITQRQ DILNLIVELPT+TiaEP+GSK LQ +1 SS ATIFisiDMA LEKLGLLEKA T 
Sbjct: 1 MITQRQKDIUn,IVELFTKTHEPIGSB?rr.QNS3JlSSRATIR3snmaia^ 60 

Query: 77 GRM-PSPAGFKYFVEHSLRLDSIDEQDIYHVIKAFDFEAFKLEDMLQKASHILSEMTGYT 135

+ P +YFVEHSL PS+DEQD+Y VIKAFDFEAF+L D+LQ+iiS +TGYT 

Sbjct: 61 AVVCPVKKAIRYPVEHSLNPDSIJJEQDVYQVIKAPDFEAPRLGDIJjQRASDVIiaia 120 

Query: 136 SVILDVEPARQRLTGFDVVQLSNHDALAVMTLDESKPVTVQFAIPRNFLTRDLIAFKAIV 195

++II1DVEP +QRLT FD+V+LSNHDAIAV+TI1DE+ PVTVQPAIP+NFIi DL+ I 
Sbjct: 121 ALlII(\mPKBaRLTTPDIVKIiSNHDAIiAVLTIimSE^ 180 

Query: 196 EERLLDGSVMDIHYKLRTEIPQIVQKYFVTTDNVLQLFDYVFSELFLETVFVAGKVNSLT 255

ER L+ +V+DIHY+LRTE PQI+QKYF TDNVL LFD++F+ +F E VF+4GK+ +L 
Sbjct: 181 RERFLNQTVLDIHYRLRTEPPQIIQKYFPRTDNVLDLFDHIFNPIFQEEVFISGKIKTLE 240 

Query: 256 YSDLSTYQFLDNEQQVAISLRQSLKEGEMASVQVADSQEAALADVSVLTHKFLIPYRGFG 315

++ L TYQPL+H Q VA+ +RQSL E E+ VQVADS+E +I1KD++V++ KPIilFYRGFG 
Sbjct: 241 FAGLDTYQPIiElsniQSWAE^IIRQSLPEDELHRVQVJiDSKEKSIjajLTVISQKFL^ 300 

Query: 316 LLSLIGPIDMDYRRSVSLVNIIGKVLAAKLGDYYRYLNSNHYEVH 360 

+L++IGP+D+DY+R++SL+N4-I +VLA KliGD+YRYLNSNHYEVH 
Sbjct: 301 ILWIGPVDI^YQRTISLIWISRVIAVKLGDPYRYIiNSNHYEVH 345 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6869> which encodes the amino acid
sequence <SEQ ID 6870>. Analysis of this protein sequence reveals the following:

Possible site: 28

>>> Seems to have no N-terminal signal sequence



Final Results 

bacterial cytoplasm --- Certainty=0.0695 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below.

Identities = 341/344 (99%), Positives = 343/344 (99%)

Query: 17 VITQRQNDILNLIVELFTQTHEPVGSKALQRTIDSSSATIRNDMAKLEKLGLLEKAHTSS 76
VITQRQNDILNLIVELFTQTHEPVGSKALQRTIDSSSATIRNDMAKLEKLGLLEKAHTSS
Sbjct: 1 VITQRQNDILNLIVELFTQTHEPVGSKALQRTIDSSSATIRNDMAKLEKLGLLEKAHTSS 60

Query: 77 GRMPSPAGFKYFVEHSLRLDSIDEQDIYHVIKAFDFEAFKLEDMLQKASHILSEMTGYTS 136
GRMPSPAGFKYFVEHSLRLDSIDEQDIYHVIKAFDFEAFKLEDMLQKASHIL+EMTGYTS
Sbjct: 61 GRMPSPAGFKYFVEHSLRLDSIDEQDIYHVIKAFDFEAFKLEDMLQKASHILAEMTGYTS 120

Query: 137 VILDVEPARQRLTGFDVVQLSNHDALAVMTLDESKPVTVQFAIPRNFLTRDLIAFKAIVE 196
VILDVEPARQRLTGFDVVQLSNHDALAVMTLDESKPVTVQFAIPRNFLTRDLIAFKAIVE
Sbjct: 121 VILDVEPARQRLTGFDVVQLSNHDALAVMTLDESKPVTVQFAIPRNFLTRDLIAFKAIVE 180

Query: 197 ERLLDGSVMDIHYKLRTEIPQIVQKYFVTTDNVLQLFDYVFSELFLETVFVAGKVNSLTY 256
ERLLD SV+DIHYKLRTEIPQIVQKYFVTTDNVLQLFDYVFSELFLETVFVAGKVNSLTY
Sbjct: 181 ERLLDNSVIDIHYKLRTEIPQIVQKYFVTTDNVLQLFDYVFSELFLETVFVAGKVNSLTY 240

Query: 257 SDLSTYQFLDNEQQVAISLRQSLKEGEMASVQVADSQEAALADVSVLTHKFLIPYRGFGL 316
SDLSTYQFLDNEQQVAISLRQSLKEGEMASVQVADSQEAALADVSVLTHKFLIPYRGFGL
Sbjct: 241 SDLSTYQFLDNEQQVAISLRQSLKEGEMASVQVADSQEAALADVSVLTHKFLIPYRGFGL 300

Query: 317 LSLIGPIDMDYRRSVSLVNIIGKVLAAKLGDYYRYLNSNHYEVH 360
LSLIGPIDMDYRRSVSLVNIIGKVLAAKLGDYYRYLNSNHYEVH
Sbjct: 301 LSLIGPIDMDYRRSVSLVNIIGKVLAAKLGDYYRYLNSNHYEVH 344

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2226 

A DNA sequence (GBSx2345) was identified in S.agalactiae <SEQ ID 6871> which encodes the amino 
acid sequence <SEQ ID 6872>. This protein is predicted to be grpE protein (grpE). Analysis of this protein
sequence reveals the following:

Possible site: 15 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.5138 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:AAC45611 GB:U78296 GrpE [Streptococcus mutans] 
Identities = 130/180 (72%) , Positives = 151/180 (83%) , Gaps = 3/180 (1%) 

Query: 14 VSEEIKKDDLQEEVEATE--TEETVEEVIEEIPEKSELELANERADEFENKYLRAHAEM- 70 
+S++ KK++ +EEVEATE TEE+VEEV EE E EL+ A ERA++FENKYLRAHAEM

Sbjct: 1 MSKKDKKEEYKEEVEATEPTTEESVEEVAEETSENKELQEALERAEDFENKYLRAHAEMP 60 

Query: 71 QNIQRRSSEERQQLQRYRSQDLAKAILPSLDNLERALAVEGLTDDVKKGLEMTRDSLIQA 130
+ + + QRYRSQDL KAILPSLDNLERALAVEGLTDDVKKGLEM ++SLIQA 

Sbjct: 61 KTFSVAIMCSDKVCQRYRSQDLRKAILPSLDNLERALAVEGLTDDVKKGLEMVQESLIQA 120

Query: 131 LKEEGVEEVEVDSFDHNFHMAVQTLPADDEHPADSIAEVFQKGYKLHERLLRPAMVVVYN 190

LKEEGVEEVE+++FD N HMaVQTL ADD+HPADSIA+V QKGy+LHERLDRPA^WVVYN 
Sbjct: 121 LKEEGVEETOLENFDAKtiHmVQTLDADDDHPADSIAQVHQKGYQLHERL^ 180 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6873> which encodes the amino acid 
sequence <SEQ ID 6874>. Analysis of this protein sequence reveals the following: 

Possible site: 15 

>>> Seems to have no N-terminal signal sequence



Final Results 

bacterial cytoplasm --- Certainty=0.5138 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below. 

Identities = 189/190 (99%), Positives = 189/190 (99%)

Query: 1 MAVFNKLFKRRHSVSEEIKKDDLQEEVEATETEETVEEVIEEIPEKSELELANERADEFE 60
MAVFNKLFKRRHSVSEEIKKDDLQEEVEATETEETVEEVIEE PEKSELELANERADEFE
Sbjct: 1 MAVFNKLFKRRHSVSEEIKKDDLQEEVEATETEETVEEVIEETPEKSELELANERADEFE 60

Query: 61 NKYLRAHAEMQNIQRRSSEERQQLQRYRSQDLAKAILPSLDNLERALAVEGLTDDVKKGL 120
NKYLRAHAEMQNIQRRSSEERQQLQRYRSQDLAKAILPSLDNLERALAVEGLTDDVKKGL
Sbjct: 61 NKYLRAHAEMQNIQRRSSEERQQLQRYRSQDLAKAILPSLDNLERALAVEGLTDDVKKGL 120

Query: 121 EMTRDSLIQALKEEGVEEVEVDSFDHNFHMAVQTLPADDEHPADSIAEVFQKGYKLHERL 180
EMTRDSLIQALKEEGVEEVEVDSFDHNFHMAVQTLPADDEHPADSIAEVFQKGYKLHERL
Sbjct: 121 EMTRDSLIQALKEEGVEEVEVDSFDHNFHMAVQTLPADDEHPADSIAEVFQKGYKLHERL 180

Query: 181 LRPAMVVVYN 190
LRPAMVVVYN
Sbjct: 181 LRPAMVVVYN 190

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2227 

A DNA sequence (GBSx2346) was identified in S.agalactiae <SEQ ID 6875> which encodes the amino 
acid sequence <SEQ ID 6876>. This protein is predicted to be heat shock protein 70 (dnaK). Analysis of 
this protein sequence reveals the following:

Possible site: 17 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.0996 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related DNA sequence was identified in S.pyogenes <SEQ ID 6877> which encodes the amino acid
sequence <SEQ ID 6878>. Analysis of this protein sequence reveals the following:

Possible site: 17 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.0996 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below.

Identities = 594/609 (97%), Positives = 603/609 (98%), Gaps = 1/609 (0%)

Query: 1 MSKIIGIDICTTNSAVAVLEGTESKIIMTPECailRTTPSWSFjaSIGEIIVGDAaKRQ^ 60 
MSKIIGIDMTTWSATOVLEGTESKIIAOTEGNRTTPSWSPKNGEIIVGDaaKRQAV^

Sbjct: 1 MSKIIGIDLGTTOSAVAVLEGTESKIIANPEGlTOTTPSWSFKNGEIIVGDAMRQftVTN 60 

Query: 61 PDTVISIKSKMGTSEKVSANGKEYTPQEISAMILQYLKGYAEDYLGEKVEKAVITVPAYF 120 
P+TVXSIKSKMGTSEKVSJU)IGKEYTPQEISAMILQYLKGYftEDYLGEKVEKAVITVPAYF 
Sbjct: 61 PEWISIKSKMGTSEWSANGKEYTPQEISAMILQYLKGYAEDYLGEKVEKAVITVPAYF 120

Query: 121 NDAQRQATKDAGKIAGLEVERIVNEPTAAALAYGMDKTDKDEKILVFDLGGGTFDVSILE 180

NDAQRQATKDAGKIAGLEVERIVNEPTAAALAYGMDKTDKDEKILVFDLGGGTFDVSILE
Sbjct: 121 NDAQRQATKDAGKIAGLEVERIVNEPTAAALAYGMDKTDKDEKILVFDLGGGTFDVSILE 180


Query: 181 LGDGVFDVLATAGDNKLGGDDFDQKIIDFLVEEFKKEaSfGIDLSQDKMRLQRLKDAAEKAK 240 

LGDGVFDVLATAGDNKLGGDDFDQKIIDFLV EFKKENGIDLSQDKMALQRLKDAAEKAK 
Sbjct: 181 LGDGVFDVLATAGDNKLGGDDFDQKIIDFLVAEFKKENGIDLSQDKMALQRLKDAAEKZ^ 240 

Query: 241 KDLSGVTQTQISLPFITAGSAGPLHLEMSLSRAKFDDLTRDLVERTKTPVRQALSDAGLS 300

KDLSGVTQTQISLPFITAGSAGPtHLEMSLSRAKFDDLTRDLVERTKTPVRQALSDAGLS 
Sbjct: 241 KDLSGVTQTQISLPFITAGSASPLHLEMSLSRAKFDDLTRDLVERTKTPVRQMiSnaGLS 300 

Query: 301 LSEIDEVILVGGSTRIPAWEAVKRETGKEPNKSVNPDEWAMGAAIQGGVITGDVKDW 360 
LSEXDEVILVGGSTRIPAWEAVKAETGKEPNKSVNPDEWAMGAAIQGGVITGDVKDW

Sbjct: 301 LSEIDEVILVGGSTRIPAVVEAVKAETGKEENKSVNPDEWAMGAAIQGGVITGDVKDW 360 

Query: 361 LLDOTPLSLGIETMGGVFTKLIDRNTTIPTSKSQVFSTAADNQPAVDIHVLQGERPlViaAD 420 
LIJDVTPLSLGIETMGGVFTKLIDRNTTIPTSKSQVFSTAADNQPAVDIHVLQGERPMAAD 
Sbjct: 361 LLDVTPLSLGIETMGGVFTKLIDRNTTIPTSKSQVFSTAADNQPAVDIHVLQGERPMAAD 420

Query: 421 NKTLGRFQLTDIPAAPRGIPQIEVTFDIDKNGIVSVKAKDLGTQKEQHIVIQSNSGLTDE 480 

NKTLGRFQLTDIPAAPRGIPQIEVTFDIDKNGIVSVKAKDLGTQKEQHIVI+SN GL++E 
Sbjct: 421 NKTLGRFQLTDIPAAPRGIPQIEVTFDIDKNGIVSVKAKDLGTQKEQHIVIKSNDGLSEE 480


Query: 481 EIDKMMKDAEANAEADAKRKEEVDLKNEVDQAIFATEKTIKETEGKGFDTERI^ 540 

EID+MMKnAEANAEADAKRKEEVDLKNEVDQAIFATEKTIKETEGKBFDTERDA^ 
Sbjct: 481 EIDRMMKDAEANAEADAKRKEEVDLKNEVDQAIFATEKTIKETEGKGFIXrERI^^ 540 

Query: 541 ELKKAQESGNLDDMKAKLEALNEKAQALAVKLYEQAAAAQQAAQGAEGAQSADSSSKGDD 600

ELK AQESGNLDDMKAKLEALNEKAQALAVK+YEQAAAAQQaAQGAEGAQ+ DS++ DD 
Sbjct: 541 ELKAAQBSGIILDDMKAKLE2pIEKAQAIlAVKMYEQAAAAQQ%AQQAEG^^3A]ro 599 

Query: 601 WDGEFTEK 609 
WDGEFTEK

Sbjct: 600 WDGEFTEK 608 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2228

A DNA sequence (GBSx2347) was identified in S.agalactiae <SEQ ID 6879> which encodes the amino 
acid sequence <SEQ ID 6880>. This protein is predicted to be Streptococcus pneumoniae DnaJ protein 
homologue (dnaJ). Analysis of this protein sequence reveals the following: 

Possible site: 18 
>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.4180 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
A related DNA sequence was identified in S.pyogenes <SEQ ID 6881> which encodes the amino acid
sequence <SEQ ID 6882>. Analysis of this protein sequence reveals the following: 

Possible site: 16

>>> Seems to have no N-terminal signal sequence



An alignment of the GAS and GBS proteins is shown below. 

Identities = 330/377 (87%) , Positives = 357/377 (94%) , Gaps = 1/377 (0%) 

Query: 1 MmTEFTORLGVSKDASQDEIKKAYRRMSKKXHPDIimiTGREEKyKEVQEA^ 60 

MNNTE+YDRLGVSKDASQD+IKKAYR+MSKKYHPDINKE GAE+KYK+VQEAYETLSD+Q 
Sbjct: 19 MNNTEYYDRLGVSKDASQDDIKKAYRKMSKKYHPDINKEAGAEQKYKDVQEAYETLSDSQ 78 

Query: 61 KRAAYDQYGAAGANGGFGGFDGGGFGGFDGGGFGGFEDIFSSFFGGGGMRNPNAPRQGDD 120 

KRAAYDQYGAAGA. GGFGG GGFG6FDGGGFGGFEDIFSSFFGGGG RNPNAPRQQDD 
Sbjct: 79 KRi^YDQYQAAGS^QGGFGG-GAGGFaSFDGGGFGGFEDlFSSFFGGGGSRISrPNAPRQQDD 137 

Query: 121 LQYRVNLSFEEAIFGAEKEVSYNRESSCHTCSGSGAKPGTSPVTCQKCHGSGVINVDTQT 180 

LQYRVNLSFEEA+FG EKEVSYNRE++C TC GSGAKPGT+PVTC+KCHGSGV+ +DTQT 
Sbjct: 138 LQYRVraJSFEEAVFGVEKEVSYlTOEATCGTCMSGaKPGTAPVTCRKCHGSGVMTIDTQT 197 

Query: 181 PLGTMRRQVTCDVCQGSGQEIKEKCPTCHGTGHEKKTHKVSVKIPAGVETGQQIRLTGQG 240 

PLG MRRQVTCD+C GSG+EIKE C TCHGTGHEK+ HKVSVKIPAGVETGQQIRL GQG 
Sbjct: 198 PLGMMRRQVTCDICHGSGKEIKEPCQTCHGTGHEKQaHKVSVKIPAGVETGQQIRIiQGQG 257 

Query: 241 EAGEMGGPYGDLFVIINVLPSQQFERNGSTIYYTMIISPVQaALGDTIDIPTVHGAVEMS 300 

EAGFNGGPYGDLFVI+NVLPS+QFERNGSTIYY L+ISF QAALGDT++IPTVHG VEM+ 
Sbjct: 258 EAGFNGGPYGDLFVILIWLPSKQFERNGSTIYYNLDISFTQAALGDTVEIPTVHGDVEMA 317 

Query: 301 IPAGTQTGKTFRLRGKGAPKIAGGGQGDQHVTVNIVTPTKIJSIDfiQKEAIJIAFAE^ 360 

IPAGTQTGKTFRL+GKSRPKLRGGGQGDQHVTWIVTPTKIiNI^ AFAEftSG+KM 
Sbjct: 318 IPAGTQTGKTFRLKGKGAPKLRGGGQGDQHVTVNIVTPTmSinAQREALQAFAEASGEK^ 377 

Query: 361 VHPKKKGFFDKVKDALD 377 

+HPKKKGFFDKVKDAL+ 
Sbjct: 378 IiHPKKRGFFDKVKDALE 394 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2229 

A DNA sequence (GBSx2348) was identified in S.agalactiae <SEQ ID 6883> which encodes the amino 
acid sequence <SEQ ID 6884>. Analysis of this protein sequence reveals the following: 

Possible site: 59

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -0.22 Transmembrane 281 - 297 ( 281 - 297) 



Final Results 



bacterial cytoplasm --- Certainty=0.1322 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>



Final Results

bacterial membrane --- Certainty=0.1086 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>



The protein has homology with the following sequences in the GENPEPT database. 



>GP:AAD24445 GB:AF118389 unknown [Streptococcus suis] 
Identities = 182/373 (48%) , Positives = 257/373 (68%) , Gaps = 5/373 (1%) 
Query: 4 KVEEIRSYLIASICJNGKLAPGDRLPSIRQLANQFSCimjWQRVIMIJlFDlSnriYAK^ 63

K + I ++ 1+ + G++LPSIRQL Q+ C+KDTVQ+ ++EL++ N lYA +S 
Sbjct: 3 KyQVIIQDILTGIEEHRFKRGEKLPSIRQLREQYHCSKDTVQKaMI,ELKYQNKIYAVEKS 62

Query: 64 GYYVFDSHQEEVEEGVSLENSEIANIAYDDFRLCLNETLIGREDYLFNYYYRQEGLLDLS 123

GYY+ + + + + ++ I Y+DFR+CL E+LIGRE+YLENYY++QEGL +L 

Sbjct: 63 GYYILEDRDPQ-DHTCaiaQSYRLSRITYEDFRICLKESLIGRENYLFNYYHQQEGLAELI 121 

Query: 124 KAVAKLMEETGVYVPLDDIVITAGTQQALFILTQVTFPNRKSRVLIEEPTYPRMIELIKT 183 
+V L+ + VY D +VITAG+QQAL+ILTQ+ K+ +LIE PTY RMIELI+

Sbjct: 122 SSVQSLIM3YHVYTKKDQLVITAGSQQALYILTQMETLaGKTEILIENPTYSRMIELIRH 181 

Query: 184 QIILPYETISRGTHGIDFQRLEEIFQTQSIKFFWIPRMHNPLGTSYNPVEMKRLlEMftEK 243 
Q +PY+TI R GID + LE IFQT IKFFY IPR+HNPLG++Y+ ++++A++ 
Sbjct: 182 QGIPYQTIERNLDGIDLEELESIFQTGKIKFFYTIPRLHNPLGSTYDIATKTAIVKLAKQ 241

Query: 244 YDVYIVEDDYMSDFASQS--PLHYYDTHGRVIYLKSFSKAIFPALRIjaAICLPQALKSTF 301 

YDVYI+EDDY++DF S PLHY DT RVIY+KSF+ +FPALR+ AI LP L+ F 
Sbjct: 242 YDVYIIEDDYLADFDSSHSLPLHYLDTDNRVIYIKSFTPTLFPALRIGAISLPMQLRDIF 301 


Query: 302 MAYKKlM)YDimiLQKftLftLYIENGLYAKNSQYLKYRYQKDLaNS^ 360 

+ +K L+DYDTNLI+QKAL+LYI+NG+H-A+N+Q+L + Y K L + N+P Y 

Sbjct: 302 IKHKSLIDYDTlSmiMQKALSLYIDNGMFARNTQHLHHIYHftQWNKIKDCLEKyAM 360 

Query: 361 SLHHDSVLFDCSK 373

+ SV F SK 
Sbjct: 361 RIPKGSVTFQLSK 373

A related DNA sequence was identified in S.pyogenes <SEQ ID 6885> which encodes the amino acid 
sequence <SEQ ID 6886>. Analysis of this protein sequence reveals the following:

Possible site: 59 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3043 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below. 

Identities = 176/382 (46%), Positives = 255/382 (66%), Gaps = 7/382 (1%)

Query: 1 MVTKVEEIRSYLIASIQNGKLAPGDRLPSIRQLANQFSCNKDTOQRVIiMELRFnNYIYAK 60 

M TK + I S + IQ +L GD+LPSIR L+ + C+KDTVQR L+EL++ + lYA 
Sbjct: 1 MTTKYQTIISNIEQDIQKQRLKKGDKLPSIRVLSKVYYCSKDTVQRALLELKYRHLIYAV 60 

Query: 61 PRSGYYVFDSHQEEVEEGVSLPNSEIANIAYDDFRLCLNETLIGREDYLFNYYYRQEGLL 120 

P+SGYYV + + ++L + N+AY+DFRLCLNE L ++ YLF+YY++ EGL 
Sbjct: 61 PKBGYYVL-GWSMPDNVIOSILSLEDYWNMAYEDFRLCriNE^ 119 

Query: 121 DLSKAVRKUffiETGVYVPLDDIVITAGTQQALFILTQVTFPNRKSRVLIEEPTYPRMIEL 180

+L +A+ + E VY D ++IT+GTQQftL+IL+Q+ FEN +L+E+PTY RM + 
Sbjct: 120 ELREALLLYIAENSVYSNKDQLLITSGTQQALYILSQMPFEimSKTILLEKPTYHRMEAI 179 

Query: 181 IKTQNLPYETISRGTHGIDFQRLEEIFQTQSIKFFYVIPRMHNPLGTSYNPVEMKRLIEM 240 
+ LPY+TISR +G+D + LE +FQT IKFFY I R +PLG SY+ E + ++ +

Sbjct: 180 VAQLGLPYQTISRHENGLDLELLESLFQTGDIKFFYTISRFSHPLGLSYSTKEKEAIVRL 239 

Query: 241 AEKYDVYIVEDDYMSDFA--SQSPLHYYDTHGRVIYLKSFSKAIFPAIiRLaAICLPQALK 298 

A++Y VYI+EDDY+ DF + P+HYYDTH R+IYLKSFS ++FPALR+ A+ LP LK 
Sbjct: 240 AQRYQVYILEDDYLGDFVKLKEPPIHYYDTHHRIIYLKSFSMSVFPALRIGALVLPSGLK 299

Query: 299 STFMAYKKLMDYDTNLILQKALALYIENGLYAKNSQYLKYRYQKDLANSKSILADHPNLP 358 

F+ K L+D DTNL++QKALALY+ENG++ KN +++K RY K ++ N P 

Sbjct: 300 PHFLTQKSLIDLDTNLLMQKALALYLENGMFQKNLRFIK-RYLKQRERQLALFLRQ-NCP 357 

Query: 359 S--YSLHHDSVLFDCSKLDNFK 378

Y L •++ D + D+++ 
Sbjct: 358 DIHYQLTPTHLVIDYTTSDSYR 379

SEQ ID 6884 (GBS423) was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 79 (lane 7; MW 49.3kDa). It was also expressed in E.coli as a GST-fusion 
product. SDS-PAGE analysis of total cell extract is shown in Figure 172 (lane 2; MW 74kDa). 

GBS423-GST was purified as shown in Figure 219, lanes 2-3.

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics.

Example 2230 

A DNA sequence (GBSx2349) was identified in S.agalactiae <SEQ ID 6887> which encodes the amino 
acid sequence <SEQ ID 6888>. This protein is predicted to be pseudouridylate synthase I (truA). Analysis 
of this protein sequence reveals the following: 

Possible site: 58

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3265 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:BAB03886 GB:AP001507 tRNA pseudouridine synthase A 
(pseudouridylate synthase I) [Bacillus halodurans]

Identities = 105/240 (43%) , Positives = 147/240 (60%) , Gaps = 2/240 (0%) 

Query: 1 MTRYKAQISYDGSAFSGFQRQPNCRTVQEEIERTLKRLNSGNDVIIHGAGRTDVGVHAYG 60
M R +++YDG+ F+G+Q QPN RTVQ E+E LK ++ G + + +GRTD GVHA G
Sbjct: 1 MKRIGLKVAYDGTDFAGYQIQPNERTVQGELESVLKNIHKiGMSIRVTASGRTDTGVHftRG 60

Query: 61 QVIHFDLPQARDVEKLRFGLDTQCPDDIDIVKVEQVSDDFHCRYDKHIKTYEFLVDIGRP 120

Q++HFD + V++ L++Q P DI +++ V DFH RY K Y + V 
Sbjct: 61 QIVHFDTSLSFPVDRWPIALNSQLPADICVLEAftDVPADPHARYSAKTKEYRYRVLTSAQ 120 


Query: 121 KNPMMRNYATHYPYPVIIELMQEAIKDLVGTHDFTGFTASGTSVENPCVRTIFDAKIQFEA 180 

'+ RNY H YP+ +E MQ A L+GTHDF+ F A+ VE+KVRTI D + E 
Sbjct: 121 ADVFRRim^VRYPLDVEAMQRRAVQLLGTHDFSSFCaAKAEVEDKVRTIEDVALWREG 180 

Query: 181 SKNLLIFTFTGNGFLYKQVRNMVGTLLKIGNGRMPISQIKTILQAKNRDLAGPTAAGNGL 240

+ LIF+ GNGFLY VR +VGTLL+IG G+ ++ IL A++R+ AG TA G+GL 
Sbjct: 181 DE--LIFSIRGNGFLYNMVRIIVGTriLEIGAGKRSAEEVAKILAaRSREAAGKTAPGHGL 238 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6889> which encodes the amino acid 
sequence <SEQ ID 6890>. Analysis of this protein sequence reveals the following:

Possible site: 58 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2558 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>



An alignment of the GAS and GBS proteins is shown below. 
Identities = 184/249 (73%), Positives = 214/249 (85%)

Query: 1 MTRYKAQISYDGSAFSGFQRQPNCRTVQEEIERTLKRLNSGNDVIIHGAGRTDVGVHAYG 60
M RYKA ISYDG+ FSGFQRQ + RTVQEEIE+TL +LN+G +IIHGAGRTD GVHAYG
Sbjct: 1 MVRYKATISYDGTLFSGFQRQRHLRWQEEIEKTLYKtNHGTKIIIHGAGRTDAGVHAYG 60

Query: 61 QVIHFDLPQARDVEKLRFGLDTQCPDDIDIVKVEQVSDDFHCRYDKHIKTYEFLVDIGRP 120
QVIHFDLPQ ++VEKI1RF LDTQ P+DID+V +E+V+DDFHCaiY KH+KTYEPLVD GRP
Sbjct: 61 QVIHFDLPQEQEVEKIiRPAIiDTQTPEDIDVVNIEKVaDDFHCRYQKHLKryEFLVDNGRP 120

Query: 121 KNPMMRNYATHYPYPVIIELMQEAIKDLVGTHDFTGFTASGTSVENKVRTIFDAKIQFEA 180
KNPMMR+Y THYPY + I+LMQEAI LVGTHDFTGFTA+GTSV+NKVRTI A + +
Sbjct: 121 KNPMMRHYTTHYPYTIJIIKIjMQEAINGLVGTHDFTGFTAAGTSVQN^ 180

Query: 181 SKNLLIFTFTGNGFLYKQVRNMVGTLLKIGNGRMPISQIKTILQAKNRDLAGPTAAGNGL 240
+ L+FTF+GNGFLYKQVRNMVGTLLKIGNG+MP+ Q+K IL +KNR LAGPT +GNGL
Sbjct: 181 KTDFLVFTFSGNGFLYKQVRNMVGTLLKIGNGQMPVEQVKVILSSKNRQLAGPTISGNGL 240

Query: 241 YLKEIIYED 249
YLKEI YE+
Sbjct: 241 YLKEICYEN 249





Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2231 

A DNA sequence (GBSx2350) was identified in S.agalactiae <SEQ ID 6891> which encodes the amino 
acid sequence <SEQ ID 6892>. This protein is predicted to be phosphomethylpyrimidine kinase (thiD).
Analysis of this protein sequence reveals the following: 

Possible site: 45 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2051 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB15828 GB:Z99123 phosphomethylpyrimidine kinase [Bacillus subtilis] 
Identities = 95/253 (37%) , Positives = 150/253 (58%) , Gaps = 13/253 (5%) 



Query: 1 MKTRNVLAISGNDIFSGGGLHADLATYVVNKLHGFVAVTCLTAMSDKG---FEVIPIEAS 57
M L I+G+D G G+ ADL T+ ++G A+T + AM +V PI+
Sbjct: 1 MSrfflKALTIAGSDSSGGAGIQRDLKTFQEKNVYGOTALTVIVaMDPNNSWNHQVFPIDTD 60

Query: 58 ILKQQLESLKD-VEFGSIKLGLLPNVETAQWLEFVKSKQECPWLDPVLVCKENHDL-- 114
++ QL ++ D + ++K G+LP V+ ++ + +K KQ W+DPV+VCK +++
Sbjct: 61 TIRAQIATITDGIGTOAMKTGMLPTVDIIEUiAKTIKEKQLKNVVIDPVWCKGSUJEVLY 120

Query: 115 --EVSQLREQLIAFFPYADVITPNLVEAQLLTGLS-IENLDQMKIAAEKLYDMGAKHWI 171
LREQL P A VITPNL EA L+G+ ++ +D M AA+K++ +GA++WI
Sbjct: 121 PEHAQALREQLA PLATVITPNLFEASQLSGMDELKTVDDMIEAAKKIHALGAQYWI 177

Query: 172 KGGNRIiNAEEATDLYYDGERFETYVFPWDANNT-GAGCTFASSIASQLAMGKNVEDAVK 230
GG +L E+A D+ YDGE E ++D T 6AGCTF++++ ++I1A G V++A+
Sbjct: 178 TGGGKLKHEKAVDVLYDGETAEVLESEMIDTPYTHQAGCTFSAAVTAELAKGAEVKEAIY 237

Query: 231 MSKGFVYQAIKAS 243
+K F+' AIK S
Sbjct: 238 AAKEFITAAIKES 250

A related DNA sequence was identified in S.pyogenes <SEQ ID 4407> which encodes the amino acid
sequence <SEQ ID 4408>. Analysis of this protein sequence reveals the following: 

Possible site: 36 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2029 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>



An alignment of the GAS and GBS proteins is shown below. 

Identities = 135/252 (53%) , Positives = 174/252 (68%) 

Query: 1 MKTRlTOiAISGNDIFSGCMLHADIiATYVVNKLHGFVATCCLTAMSDKGFEV^ 60 

MKT ++ ISGNDI SGGGL+ADrATY+ L FmVTCLT S++GF + P+ I +

Sbjct: 1 MKTDYIVTISGNDILSGGGLYADLATYIRYDLQAFVAVTCLTTRSEEGFSLFPVAKEIFR 60 

Query: 61 QQLESLKDVEFGSIKLGLLPNVETAQWLEFVKSKQECPWLDPVLVCKENHDLEVSQLR 120 
QL S + +IK+GLLPN E ++VL+F+K PWLDPVL CKE D+++ LR 

Sbjct: 61 DQIiNSFTNAPISAIKIGLLPNAEMCEIVIJDFIKGHLGIPVVLDPVLACKEIDDVKIVPI^ 120

Query: 121 EQLIAFFPYADVITPNLVEAQLLTGLSIENLDQMKIAAEKLYDMGAKmA?IKGGNRIJ!C^ 180 

++++ PY V+TPIULVEAQLL+ I +L M+ AAh- Y +GftK WIRGGHR + + 
Sbjct: 121 QEILQLLPYVTVVTPI&VEftQLLSQKEIVSLKDMQEAAKYFYQLGSyKQVVIKiGGim 180 


Query: 181 EATDLYYDGERFETYVFPVVDANNTGAGCTFASSIASQLAMGKNVEDAVKMSKGFVYQAI 240 

+A DL+YDG+ T PV++ NN GAGCTFASSIASQL K +AVK SK VYQAI 
Sbjct: 181 KAIDLFYrK3KEIVTLECPVLEKOTJIGAGCTFASSIASQLVKKKTPLEAVKNSKELVYQAI 240 

Query: 241 KASDKYGVVQHF 252

SD+YGV Q + 
Sbjct: 241 LQSDRYGVKQSY 252 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics.

Example 2232 

A DNA sequence (GBSx2351) was identified in S.agalactiae <SEQ ID 6893> which encodes the amino 
acid sequence <SEQ ID 6894>. Analysis of this protein sequence reveals the following: 

Possible site: 45 
>>> Seems to have a cleavable N-term signal seq.

INTEGRAL Likelihood = -6.05 Transmembrane 97 - 113 ( 96 - 119) 
INTEGRAL Likelihood = -0.22 Transmembrane 54 - 70 ( 54 - 70) 

Final Results 

bacterial membrane --- Certainty=0.3421 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:BAA30952 GB:AP000007 202aa long hypothetical protein [Pyrococcus horikoshii]

Identities = 48/148 (32%) , Positives = 78/148 (52%) , Gaps = 9/148 (6%) 

Query: 10 VQLAIVTAI^IVLGMFISIPTPTGFLTLLDAGIFFAAFYFGKKEGAWGALAGFLIDLLK 69 

V A+VTA+++V+ I IP G+L D I + FG G G + DLL

Sbjct: 49 VMAALVTAMTMVIR--IPIPASQGYLNFGDIMIMLTSVLFGPLVGGFAGGVGSAFADLL- 105 



Query: 70 GYPNWMFFSLLIHGTQGYLAGLPGR RRLLGLISATLVMVLGYAIASGLMYGWGA 123 

GYP+W F+L+I GT+G + G + + LLG + VMV+GY + ++YG 



Sbjct: 106 GYPSWMiFTLVIKGTEGIlVGYFSKGEfiNYGKILLGTVLGGSVMVIGWSVAYVLYGPAG 165

Query: 124 VLPDIPGNIMQNMVGMWGFALNKSLER 151 

+ ++ +I+Q + G+V+G L L++ 
Sbjct: 166 AIGELYNDIVQAVSGIVIGGGLGYILKK 193 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6895> which encodes the amino acid 
sequence <SEQ ID 6896>. Analysis of this protein sequence reveals the following: 

Possible site: 54 

>>> Seems to have a cleavable N-term signal seq.

INTEGRAL Likelihood = -4.62 Transmembrane 98 - 114 ( 97 - 119)
INTEGRAL Likelihood = -0.00 Transmembrane 135 - 151 ( 135 - 151)

Final Results

bacterial membrane --- Certainty=0.2848 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the databases:

>GP:CAB49310 GB:AJ248284 hypothetical protein [Pyrococcus abyssi] 
Identities = 42/145 (28%) , Positives = 73/145 (49%) , Gaps = 10/145 (6%) 

Query: 7 RQMSLTGILTALVWLGRFVMLPTPT--GFLTLLDAGIYAVSFSFGSAQGAIVGGLSGPL 64 
R ++++ + ALV + + +P P G+L D I V+ FG G GG+ +

Sbjct: 39 RTVAISAVRAALVTAMTMVIRIPIPASQGYIJSIF6DIMIMLVAVLFGPLVGGFAGGVGSAI 98 

Query: 65 IDLVAGYPQVilMFHSLIAHSVQGYFAGWRGR KRWLGWIGSFIMIFWYFLGSLML 118 

DL+ GYP W +LI +G G+ + K +G V+G FIM+ Y S +L 

Sbjct: 99 ADLI-GYPSWALFTLIIKGSEGLWGYPSKGEPNYSKILIGTVLGGFIMVLGYVSVSYVL 157

Query: 119 GYGLSGSLAGIWGNVMQNTLGLFVG 143 

YG +G+++ ++ + +Q G+ +G 
Sbjct: 158 -YGPAGAISELYNDTVQAVSGIVIG 181 


An alignment of the GAS and GBS proteins is shown below. 

Identities = 77/155 (49%) , Positives = 106/155 (67%) , Gaps = 1/155 (0%) 

Query: 1 MRKEKTSQLVQLAIVTAISIVLGMFISIPTPTGFLTLLDAGIFPAAFYFGKKEGAVVGAL 60
M+ K Q+ I+TA+ +VLG F+ +PTPTGFLTLLDAGI+ +F FG +GA+VG L

Sbjct: 1 MQNSKIRQMSLTGILTALWVLGRFVMLPTPTGFLTLLDAGIYAVSFSFGSAQGAIVGGL 60 

Query: 61 AGFLIDLLKGYPNWMFFSLLIHGTQGYLAGLPGRRRLLGLISATLVMVLGYAIASGLM-Y 119 
+GFLIDL+ GYP WMF SL+ H QGY AG GR+R LG++ + +M+ Y + S ++ Y 
Sbjct: 61 SGFLIDLVAGYPQWMFHSLIAHSVQGYFAGWRGRKRWLGWIGSFIMIFWYFLGSLMLGY 120

Query: 120 GWGAVLPDIPGNIMQNMVGMWVGFALNKSLERVKK 154 

G LI GN+MQN +G+ VGF + K++ R KK
Sbjct: 121 GLSGSLAGIWGNVMQNTLGLFVGFIIFKAILRQKK 155


Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2233 

A DNA sequence (GBSx2352) was identified in S.agalactiae <SEQ ID 6897> which encodes the amino 
acid sequence <SEQ ID 6898>. Analysis of this protein sequence reveals the following:

Possible site: 43 

>>> Seems to have no N-terminal signal sequence



Final Results 
bacterial cytoplasm --- Certainty=0.0881 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB15708 GB:Z99122 alternate gene name: ipc-33d [Bacillus subtilis] 
Identities = 91/176 (51%) , Positives = 115/176 (64%) 

Query: 6 NKLKQETKAIWDIIERSALKKGQIFVLGLSSSEVSGGLIGKNSSSEIGEIIVEVILKEL 65 

N+LKQ K ++ + +++ LK+ Q+FVLG S+SEV+G IG + S +1 E I + + 
Sbjct: 2 NELKQTWKTMLSEFQDQaELKQDQLFVLGCSTSEVaGSRIGTSGSVDIAESiySGLftELR 61 

Query: 66 HSRGIYLAVQGCEHVNRALWEAELAERQQLEWNWPNLHAGGSGQVAAFKLMTSPVEV 125 

GI+LA Q CEH+NRALWEAE A+ +L V+ VP AGG+ AFK M SPV V 
Sbjct: 62 EKTGIHLAFQCCEHLNRALVVEftETAKIjFRLPOTSAVPVPKAGGaMASYAFKQMK^ 121 

Query: 126 EEIVAHftGIDIGDTSIG^O^IKRVQVPLIPISRELGGAHVTALASRPKDIGGaR2«3Y 181 

E I A AGIDIGDT IGMH+K V VP+ LG AHVT +RPKLIGG RA Y 

Sbjct: 122 ETIQArlAGIDIGDTFIG^fflLKF^aVPVRVSQNSIX3SAHVTLftRTRPKIiIGGVRAVy 177 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6899> which encodes the amino acid 
sequence <SEQ ID 6900>. Analysis of this protein sequence reveals the following: 

Possible site: 40 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2166 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below. 

Identities = 132/183 (72%) , Positives = 151/183 (87%) 

Query: 6 NKLKQETKAIWDIIERSALKKGQIFVLGLSSSEVSGGLIGKNSSSEIGEIIVEVILKEL 65 

N L+++T+ IV+D++ERSA++ G +FVLGLSSSE+ G IGK SS E+G+I+VEV+L EL 
Sbjct: 3 HISn^EKQTREIVIDVVERSAIQPGNLFVLGLSSSEILGSRIGKQSSLEVGQIWEVVLDEIi 62 

Query: 66 HSRGIYLAVQGCEHVNRALVVEAELAERQQLEVVN\ArtTOiHAGGSGQVaAFKIiMTSPVE^ 125 

+ RG++LAVQGCEHVNRALVVE +AE +QLE+VNWENI1HAGGS Q+AAF+LM+ PVEV 
Sbjct: 63 NKRGVHLAVQGCEHVNRALVVERHVAESKQLEIVNVVENLHaGGSAQMft2\FQIMSDE^ 122 

Query: 126 EEIVAHAGIDIGDTSIGMHIKRVQVPLIPISRELGGAHVTALASRPKLIGGARAGYTSDP 185 

EE++AHAG+DIGDT+IGMHIKRVQ+PLIP RELGGAHVTALASRPKLIGGARA Y D 
Sbjct: 123 EEVIAHftGLDIGDTAIGMHIKRVQIPLIPCQREIXSGAHVTALASRPKLIGGftRADYNMDI 182 

Query: 186 IRK 188 
IRK 

Sbjct: 183 IRK 185 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2234 

A DNA sequence (GBSx2353) was identified in S.agalactiae <SEQ ID 6901> which encodes the amino
acid sequence <SEQ ID 6902>. Analysis of this protein sequence reveals the following: 

Possible site: 56 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood =-11.25 Transmembrane 21 - 37 ( 13 - 46) 
INTEGRAL Likelihood = -4.30 Transmembrane 78 - 94 ( 76 - 113) 
INTEGRAL Likelihood = -2.07 Transmembrane 96 - 112 ( 95 - 113) 

Final Results

bacterial membrane --- Certainty=0.5501 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:BAB06385 GB:AP001516 unknown conserved protein [Bacillus halodurans] 
Identities = 105/261 (40%) , Positives = 150/261 (57%) , Gaps = 2/261 (0%) 

Query: 12 NVEEVLPTFFTiCLIS--ILLLIIAFVITOQVIOTLFEKTVNRSiai'SRQKWARQKTLAKL 69
N+ FT +1+ +L+ +IAF+IVR + + + R ++ R TL KL
Sbjct: 7 NITSGAFIASTFIIAGKVLVAVIAFLIVRAIGKRIISNSFARMAKmQLSSGRVVTLEKL 66

Query: 70 SHmniiOTTLYFFLFYWILSILGVPISSLLBGaGIAGVaiGLQAQGFLSDVVNGFFILLEN 129
S N +YTL F +L+I G+ S+L+AGftGI G+AIG GAQG +SD+V GFFILLE
Sbjct: 67 SimFSYnMFIFATTIiTIFGMPSRLIAGRGIVGIMGPGRQGLVSDIVTGFFILLEK 126

Query: 130 QFrMSDIINVGWSGTVTNVGIRTTQIHDFDGTLHFIENRNITIVSNKSRSl^^ 189
Q DVGD + G V G V VG+RT I FDGTLH+IENRNI VSN SR NMRA +DI
Sbjct: 127

Query: 190
+ N+D+ ++ K+ ++ + 1+ P V G + V RI T+N Q+
Sbjct: 187

Query: 250
K ++A+ I++P
Sbjct: 247 ILLRKQLKEALEAHNIEIP 267

A related DNA sequence was identified in S.pyogenes <SEQ ID 6903> which encodes the amino acid 
sequence <SEQ ID 6904>. Analysis of this protein sequence reveals the following: 

Possible site: 54 
>>> Seems to have no N-terminal signal sequence
INTEGRAL Likelihood = -8.49 Transmembrane 24 - 40 ( 15 - 45)
INTEGRAL Likelihood = -4.83 Transmembrane 78 - 94 ( 73 - 99)
INTEGRAL Likelihood = -2.07 Transmembrane 96 - 112 ( 95 - 113) 

Final Results 

bacterial membrane --- Certainty=0.4397 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the databases: 

>GP:BAB06385 GB:AP001516 unknown conserved protein [Bacillus halodurans]

Identities = 104/249 (41%) , Positives = 151/249 (59%) , Gaps = 4/249 (1%) 

Query: 22 KKLVSLIILLLFFAILKRVTNYLFEKTINKSFAYSRQSERRKKTLSKLTHNILNYLLYFL 81
K LV++I L+ AI KR+ + F+ + +SRTL KL+ N +Y L F+
Sbjct: 23 KVLVAVIAFLIVRAIGKRIISNSFARMAKNN QLSSGRWTLEKLSLNAFSYTLMFI 78

Query: 82 LIYWILSLFGIPVSSLLAGAGIAGVAIGLGAQGFLSDWNGFFILFENQFEVGDNVTISD 141
+L++FG+ S+L+AGftGI G+AIG GAQG +SD+V GFFIL E Q +VGD VT
Sbjct: 79 FATTLLTIPGLNPSALIAGAfilVGLAIGFGAQGLVSDIVTGFFILLEKQXDVGDYVTAGG 138

Query: 142 IEGSVFGVGIRTTQIRGFDGTLHFIPNRSITWSNKSRGNMRALIEIPLYSTVNLSQVTR 201
++G V VG+RT IRGFDGTLH+IPNR+I VSN SRGNMRAL++I + N+ +
Sbjct: 139

Query: 202
++ +V + +1+ P+++G QN + RI TEN EQ+ + + +EA
Sbjct: 199

Query: 262 LLKEGIQLP 270
L I++P
Sbjct: 259 LERHNIEIP 267

An alignment of the GAS and GBS proteins is shown below.

Identities = 164/265 (61%) , Positives = 215/265 (80%) 



Query: 7 FIDHRWEEVLFTFFTKLISILLLIIAFVIVRQVINYLFEKTVNRSLAFSRQKVARQKTL 66
+++ ++E + T F KL+S+++L++ F I+++V NYLFEKT+N+S A+SRQ AR+KTL
Sbjct: 7 YLEQSHIENIGLTIFKKLVSLIILLLFFAILKRVTOYLFEKTINKSFAYSRQSEARKKTL 66

Query: 67 AKLSinmJSIYTLyFFLFYWILSILGVPISSLLAGAGIAGVAIGWSaQGFLSDV™ 126
H-KL+HN+ENY LYF L YWILS+ G+P+SSLLftGftGIAGWAIGLGAQGFLSDWNGPFIL
Sbjct: 67 SKLTHNIUmiLYFLLIYWILSLFGIPVSSLLAGAGIAGVAIGIXSBiQGFLSDVViro 126

Query: 127 LENQFDVGDIINVGTVSGTVTNVGIRTTQIHDFDGTLHFIPNRNITIVSNKSRSNMRAQI 186
ENQF+VGD + + + G+V VGIRTTQI FDGTLHFIPNR+IT+VSNKSR NMRA I
Sbjct: 127 FENQFEVGDim'ISDIEGSVFGVGIRTTQIRGFDGTLHFIPNRSITWSNKBRGNM^ 186

Query: 187 DIPLFVHTNLDQISDIOTKINEEYVSKHPAIVGEPT\rFGPTITONGQFVYRINIFTQNGA 246
+IPL+ NL Q++ 1+ ++N++ + HP IVG+P + GP N+NGQF +RI IFT+NG
Sbjct: 187 EIPLYSTVNLSQVTRIIDEVNQKELPNHPQIVGKPNILGPQNNSNGQFTFRIAIFTENGE 246

Query: 247 QFDIYAEFYKLYQKAILEEGIDLPT 271
QF lY FY+nYQ+A+L+EGI LPT
Sbjct: 247 QFKIYHTFYRLYQEALLKEGIQLPT 271





Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2235 

A DNA sequence (GBSx2354) was identified in S.agalactiae <SEQ ID 6905> which encodes the amino 
acid sequence <SEQ ID 6906>. This protein is predicted to be RopA (tig). Analysis of this protein sequence 
reveals the following: 
Possible site: 20 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1785 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9283> which encodes amino acid sequence <SEQ ID 9284> 
was also identified. 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6907> which encodes the amino acid 
sequence <SEQ ID 6908>. Analysis of this protein sequence reveals the following:

Possible site: 49 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.0776 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below. 

Identities = 303/354 (85%) , Positives = 337/354 (94%) 
Query: 1 MSTSFENKATNRGIITFTISQDEIKPALDQAFNKVKKDLNVPGFRKGHMPRTVFNQKFGE 60
MSTSFENKATNRG+ITFTISQD+IKPALD+AFNK+KKDLN PGFRKGHMPR VFNQKFGE
Sbjct: 30 MSTSFENKATNRGVITFTISQDKIKPMiDKAFNKIKKDLNAPGFRKGHMPRPVFNQKFGE 89

Query: 61 EALYENAIOiKjVLPKAYEAAVAELGLDVVAQPKIDWSMEKGQDWKLTAEVVTKPEVKLGD 120
E LYE+ALN+VLP+AYEAAV EIiGLDWaQPKIDWSMEKG++W L+AEWTKPEVKLGD
Sbjct: 90 EVLYEDMJIIVLPEAYEAAVTELGLDVVAQPKIDWSMEKGKEWTLSAEVVTKPEVKIjGb 149

Query: 121 YKDLSVEVDASKEVSDEEVDAKVERERNNLAELTVKDGEftAQGDTWIDFVGSVD^ 180
YK+L VEVDASKEVSDE+VDAK+ERER NLAEL +KDGEAAQGDTWIDFVGSVDGVEFD
Sbjct: 150 YKNLVVEVDASKEVSDEDVDAKIERERQNLAELIIKDGEAAQGDTWIDFVGSVDGVEFD 209

Query: 181 GGKGDNFSLELGSGQFIPGFEEQLVGSKAGQTVDVNVTFPEDYQAEDIjAGKDAKFVTTIH 240
GGKGDNFSLELGSGQFIPGFE+QLVG+KftG V+VNVTFPE YQAEDLAGK AKF+TTIH
Sbjct: 210 GGKGDNFSLELGSGQFIPGFEDQLVGAKAGDEVEVNVTFPESYQaEDLAGKAAKFMTTIH 269

Query: 241 EWTKEVPALDDELAKDIDDEVETLDELKAKYRKELESAKEIAFDDAVEGAAIELRVANA 300
EVKTKEVP LDDELAKDID++V+TL++LK KYRKELE+A+E A+DDAVEGAAIELAVANA
Sbjct: 270 EVKTKEVPEI^DELAKDIDEDVDTLEDLKVKYRKELEAAQETAYDDAVEGAAlELAVaNA 329

Query: 301 EIVELPEEMVHDEVHRAMNEFMGNMQRQGISPEMYFQLTGTTEEDLHKQYQaDA 354
EIV+LPEEM+H+EV+R++NEFMGNMQRQGISPEMYFQLTGTT+EDLH QY A+A
Sbjct: 330 EIVDLPEEMIHEEVNRSVNEFMGNMQRQGISPEMYFQLTGTTQEDLHNQYSAEA 383





Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2236 

A DNA sequence (GBSx2355) was identified in S.agalactiae <SEQ ID 6909> which encodes the amino
acid sequence <SEQ ID 6910>. This protein is predicted to be galactose-6-phosphate isomerase laca subunit 
(rpiB). Analysis of this protein sequence reveals the following: 

Possible site: 26 

>>> Seems to have no N-terminal signal sequence 

Final Results

bacterial cytoplasm --- Certainty=0.3491 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAA25177 GB:M60447 galactose 6-P isomerase [Lactococcus lactis] 
Identities = 92/141 (65%), Positives = 115/141 (81%)

Query: 1 MTIIIGADAHGVELKEVIRQHLTSLGKEIIDLTDTSKDFVDNTLAIVAKVNQKEDNLGIM 60 

M I++GAD G LK+V++ L G E+ID+T +DFVD TLA+ ++VN+ E NLGI+ 
Sbjct: 1 MAIWGADLKGTRLKDWKNFLVEEGFEVIDVTKDGQDFVDVTLAVASEVNKDEQNLGIV 60 

Query: 61 VDAYGVGPFMVATKVKGMIAAEVSDERSAYMTRAHNNARMITLGSEIVGPGVAKHIVEGF 120

+DAYG GPFMVATK+EGM+AAEVSDERSAYMTR HNNARMIT+G+EIVG +AK+I + F 
Sbjct: 61 IDAYGAGPFMVATKIKGMVaAEVSDERSAYMTRGHNNARMITVGaEIVGDELAKNIAKAP 120 

Query: 121 VDGTYDAGRHQIRVDMLNKMC 141 

V+G YD GRHQ+RVDMLNKMC 
Sbjct: 121 VNGKYDGGRHQVRVDMLNKMC 141 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6911> which encodes the amino acid 
sequence <SEQ ID 6912>. Analysis of this protein sequence reveals the following: 

Possible site: 45 

>>> Seems to have no N-terminal signal sequence
Final Results
bacterial cytoplasm --- Certainty=0.3224 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below.

Identities = 101/140 (72%) , Positives = 117/140 (83%) 

Query: 1 MTIIIGADAHGVELKEVIRQHLTSLGKEIIDLTDTSKDFVDNTLAIVAKVNQKEDNLGIM 60 
M II+GfiDAHG LKE+I+ L G +IID+TD + DF+DHTIA+ VN+ E IjGIM 
10 Sbjct: 1 MAIILGftDMGNALKELIKSFLQEEGYDIIDVTDINSDFIDNTLAVAKaVNEAEGR^^ 60 

Query: 61 VDAYGVGPFMVATKVKGMIAJ^VSDERSAYMTRMJNNARMITLGSEIVGPGVAra 120 

VDAYG GPFMVATK+KGM+AAEVSDERSAYMTR HNNARMIT+G+EIVGP +AK+IV+GF 
Sbjct: 61 VDAYGAGPFMVATKLKG^WAAEVSDERSA■YMTRGHNNARMITIGAEIVGPELAKNIVKGF 120 

15 

Query: 121 VDGTYDAGRHQIRVDMLNKM 140 

V G YD GRHQIRVDMIiNKM 
Sbjct: 121 VTGPYDGGRHQIRVDMLNKM 140 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 
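
The alignment summary lines quoted in these Examples ("Identities = a/b (p%)") can be parsed and re-checked. The following Python sketch is illustrative only. It assumes the printed percentage is the truncated value of 100·a/b, which matches the Identities figures above (for example, 101/140 is printed as 72%). `STAT_RE`, `parse_alignment_stats` and `percent_truncated` are hypothetical names.

```python
import re

# Matches "Identities = 101/140 (72%)" and tolerates stray spaces before commas.
STAT_RE = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_alignment_stats(line):
    """Return {name: (matches, aligned_length, printed_percent)} for one summary line."""
    stats = {}
    for name, num, den, pct in STAT_RE.findall(line):
        stats[name] = (int(num), int(den), int(pct))
    return stats


def percent_truncated(num, den):
    """Integer percentage with the fractional part dropped (an assumption, see above)."""
    return (100 * num) // den


line = "Identities = 101/140 (72%) , Positives = 117/140 (83%)"
for name, (num, den, pct) in parse_alignment_stats(line).items():
    agrees = percent_truncated(num, den) == pct
    print(name, f"{num}/{den}", f"printed {pct}%", "agrees" if agrees else "re-check")
```

A disagreement between the recomputed and printed percentage is a prompt to compare the line with the source listing, not evidence of which figure is wrong.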

Example 2237 

A DNA sequence (GBSx2356) was identified in S.agalactiae <SEQ ID 6913> which encodes the amino 
acid sequence <SEQ ID 6914>. This protein is predicted to be galactose-6-phosphate isomerase lacb 
subunit (rpiB). Analysis of this protein sequence reveals the following: 

Possible site: 35 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2511 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

A related GBS nucleic acid sequence <SEQ ID 10189> which encodes amino acid sequence <SEQ ID 
10190> was also identified. 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAA25178 GB:M60447 galactose 6-P isomerase [Lactococcus lactis] 
Identities = 138/171 (80%), Positives = 157/171 (91%) 

Query: 10 MKIAVGCDHIVTYDKIAWDYLKTKGYEVIDCGTYDNIRTHYPIYGKKVGEAVASGKADL 69 

M+IA+GCDHIVT K+AV ++LK+K6YEV+D GTYD++RTHYPIYGKKVGEAV SG+ADL 
Sbjct: 1 MRIAIGCDHIVPDVKMAVSEFLKSKGYEVLDFGTYDHVRTHYPIYGKKVGEAWSGQADL 60 

Query: 70 GVCICGTGVGINNAWKVPGIRSALVRDLTSAIYAKEEiasJANVIGPGGKITGGLIMrDII 129 
GVCICGTGVGINNAVNKVPG+RSALVRD+TSA+YAKEEUIANVIGFGG ITGGLLM DII 

Sbjct: 61 GVCICGTGVGINNAWKA^PGWSMjVEDMTSRLYAKEELNRNVIGFGGMITGGLLNMDII 120 

Query: 130 EAFIRAKYKPTKENKWliIEKIAEVETHNaHQEENDFFTEFIJJKWNRGEYHD 180 
EAFI A+YKPT+ENK LI KI VETHNAHQ + +FFTEFL+KW+RGEYHD 
Sbjct: 121 EAFIEAEYKPTEENKKLIAKIEHVETHNAHQaDEEFFTEFLEKWDRGEYHD 171 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6915> which encodes the amino acid 
sequence <SEQ ID 6916>. Analysis of this protein sequence reveals the following: 

Possible site: 57 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.3048 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

An alignment of the GAS and GBS proteins is shown below. 

Identities = 136/171 (79%), Positives = 160/171 (93%) 

Query: 10 MKlAVGCDHIVTYDKIAVVDYLKTKGYEVIDOjTYDNIRTHYPIYGKKVGEAVASGKftDL 69 
MKIA+GCDHIVT +K+AV D+LK+KGY+VIDCGTYD+ RTHYPI+GKKVGEAV +G+ADL 

Sbjct: 2 MKIAIGCDHIVTNEKMAVSDFLKBKGYDVIDCGTYDHTRTHyPIFGKKVGEAVVNGQADL 61 

Query: 70 GVCICGTGVGINNAVNKVPGIRSALVRDLTSAIYAKEELNAMVIGFGGKITGGLLMTDII 129 
GVCICGTGVGINNAVNKVPGIRSALVRD+T+A+YAKEEIJSIANVIGFGGKITG LLM DII 
Sbjct: 62 GVCIa3TGVGINNAVNKVPGIRSAr.VRD^m'ALYAKEERIANVIGFGGKITGErJ:J^ 121 

Query: 130 EAFIRAKYKPTKENKS/LIEKIAEVETHNAHQEENDFFTEFLDKWNRGEYHD 180 

+API+A+YK T+ENK LI KIA +E+H+A+QE+ DFFTEFL+KW+RGEYHD 
Sbjct: 122 DAPIKREYKETEENKKLIAKIAHLESHHftNQEDPDFFTEFLEKWDRGEYHD 172 


Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2238 

A DNA sequence (GBSx2357) was identified in S.agalactiae <SEQ ID 6917> which encodes the amino 
acid sequence <SEQ ID 6918>. Analysis of this protein sequence reveals the following: 

Possible site: 24 

>>> Seems to have an uncleavable N-term signal seq 

Final Results 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

A related GBS nucleic acid sequence <SEQ ID 10187> which encodes amino acid sequence <SEQ ID 
10188> was also identified. 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAA25179 GB:M60447 tagatose 6-P kinase [Lactococcus lactis] 
Identities = 192/310 (61%), Positives = 236/310 (75%) 

Query: 11 MILTVTLNPSIDISYCLENE]M)TVNRVTDVSKTPGGKGIJm'RVLSQIK3DNVVATO^ 70 

MILTVTLNPS+DISY LE +DTVNRV DVSKT GGKGIiNVTRVL + GD V ATG LG 
Sbjct: 1 MILTVTraPSVDISYPLETLKIDTVNRVKDVSKTAGGRGUmRVLYESGDK^ 60 

Query: 71 GDFGDFIRSGLDALEIRHQFLSIGGETRHCIAVLHEGQQTEILEKGPHITKDERDAFLNH 130 
G G+FI S L+ + FIG TR+CIA+LHEG QTEILE+GP 1+ +EA+ FL+H 

Sbjct: 61 GKIGEFIESELEQSPVSPAFYKISGNTRNCIAIIiHEGNQTEILEQGPTISHEE2iEGFLDH 120 

Query: 131 LKLIFnRATIIWSGSLPKGLPSDYYARLISIANHFNKKVVLDCSGEALRSVLKSSAKPT 190 
+ + ++T+SGSLP GLP+DYY +LI LA+ WLDCSG L +VLKSSAKPT 

Sbjct: 121 YSNLIRQSEVVTISGSLPSGLPNDYYEKIiIQIASDEGVAVVIJDCSGAPLETOLKSSAKPT 180 

Query: 191 VIKPNLEELTQLIGKPISYSLDELKSTLQQDLFRGIDWVIVSLGARGAFAKHGNHYYQVT 250 

IKPN EEL+QL+GK ++ ++ELK L++ LF GI+W++VSLG GAFAKHG+ +Y+V 
Sbjct: 181 AIKPNNEELSQLLGKEOTKDIEELKDVLKESLFSGIEWIVVSLGRNGAFAKHGDVFYKVD 240 






Query: 251 IPKIEVINPVGSGDATVAGIASALEHQLDDTNLLKRAtTVXfiMLNAQETLTGHINLTYYQE 310 

IP I V+NPVGSGD+TVAGIASAL + D +I1I.K A LGMUSIAQET+TGH+N+T Y+ 
Sbjct: 241 IPDIPVVNPVGSGDSTVAGIASALNSKKSDADLLKHftMTI/SMLmQETMTGHVN^^ 300 

Query: 311 EISQIQVKEV 320 

L SQI VKEV 
Sbjct: 301 liNSQIGVKEV 310 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6919> which encodes the amino acid 
sequence <SEQ ID 6920>. Analysis of this protein sequence reveals the following: 

Possible site: 14 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.1178 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

An alignment of the GAS and GBS proteins is shown below. 

Identities = 184/310 (59%), Positives = 232/310 (74%), Gaps = 1/310 (0%) 

Query: 11 MILTVTIJ!IPSIDISyCLENFNMDTVNRVTDVSKTPC3GRGMI^ 70 
+ILTVTLNP+ID+SY L+ DTVNRV IJV+RrPGGRGLNV+RVL++ G+ V ATG +G 
Sbjct: 1 VILTVTIJIPAIDVSYPLDELKCDTVNRVVDVTKTPGGKGiaWSRVIilSE 60 

Query: 71 GDFGDFIRSGLDALEIRHQFLSIGGETRHCIAVLHEGQQTEILEKGPHITKDEADAFLNH 130 

G+ GDFI + L I +F I G+TR CIA+LHEG QTEILEKGP ++ DE D P +H 
Sbjct: 61 GESGDFIINHLPD-SILSRFYKISGDTRTCIAILHEGNQTEILEKGPMLSVDEIDGFTHH 119 


Query: 131 LKLIFDaATIIWSGSLPKGLPSDYyARLISLBNHENKKVVIiDCSGEALRSVLKSSAKPT 190 

K + + ++T+SGSLP G+P DYY +LI +AN KK VLDCSG AL +VLK +KPT 
Sbjct: 120 FKYLLOTDVDVVTLSGSLPAGMPDDYYQKLIKIANLNGKKTVLDCSGNALEAVLKGDSKPT 179 

Query: 191 VIKPNLEELTQLIGKPISYSLDELKSTLQQDLFRGIDWVIVSLGARGAFAKHGNHYYQVT 250 

VIKPNLEEL+QL+GK ++ D LK LQ +LF GI+W+IVSLGA G FAKH + +Y V 
Sbjct: 180 VIKPNLEELSQLLGKEMTKDFDALKEVLQDELFDGIEWIIVSLGftDGVFAKHKDTFYNVD 239 

Query: 251 IPKIEVINPVBSGnATVAGIASALEHQLDDTISILIJaUiNVIOO^ 310 
IPKI++++ VGSGD+TVAGIAS L + DD LL +ANVLGMUSIAQE TGH+N+ Y + 

Sbjct: 240 IPKIKIVSAVGSGDSTVAGIASGLANDEDDRALLTKANVLGMLNAQEKTTGHViq^^ 299 

Query: 311 LISQIQVKEV 320 
L I+VKEV 
Sbjct: 300 LYQSIKVKEV 309 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2239 

A DNA sequence (GBSx2358) was identified in S.agalactiae <SEQ ID 6921> which encodes the amino 
acid sequence <SEQ ID 6922>. This protein is predicted to be tagatose 1,6-diphosphate aldolase. Analysis 
of this protein sequence reveals the following: 

Possible site: 25 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.0369 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAA25180 GB:M60447 tagatose 1,6-diP aldolase [Lactococcus lactis] 
Identities = 253/325 (77%), Positives = 295/325 (89%) 

Query: 1 MGLTEQKQKHMEQLSDKNGIISALAFDQRGALKRLMAKYQSEEPTVSQIEALKVLVAEEL 60 

M LTEQK+K +E+LSDKNG ISALAFDQRGALKRLMA+YQ EPTV+Q+E LKVLVA+EL 
Sbjct: 1 MVLTEQKRKSLEKLSDKNGFISALAFDQRGaLKRLMAQYQDTEPTVAQMEELKAnjVOT^ 60 

Query: 61 TPYASSrttiLDPEYGLPATKVIJDDNAGLLLAYEKTGYDTSSTKRLPDCLDIWSAKRI^^ 120 

T YASSMLLDPEYGLPATK LD AGLLIiA+ERTGYDTSSTKRLPDCLD+WSAKRIKE+G 
Sbjct: 61 TKYASSmLDPEYGLPATKALDKEaGT.T.TiRFBKTGYDTSSTKRLPDCLDVWSAKRIKEQG 120 

Query: 121 ADAVKFLLYYDVDSSDEVNEEKEAYIERIGSECVAEDIPFFLEILSYDEKITDSSGIEYA 180 

ADAVKFIiLYYDVDSSDE+N++K+AYIER+GSECVAEDIPFFLEIL+YDE+I+D+ +EYA 
Sbjct: 121 ADAVKFLLYYDVDSSDELNQQKQAYIERVGSECVAEDIPFFLEIIiAYDEEISDAGSVEYA 180 

Query: 181 KIKPRKVIEAMKVFSNPRFNID^rtlKVEVPVHMDYVEGFAQGETAYNKATAAAYFREQDQA 240 

K+KPRKVIEAMKVFS+PRFNIDVLKVEVPVN+ YVEGFA GE Y+KA AA +F+ Q++A 
Sbjct: 181 KVKPRKVIEAMKVFSDPRFNIDVLKVEVPVNVKYVEGFADGEWYSKAEAADFFKAQEEA 240 

Query: 241 TIiLPYIFLSAGVPAQLFQETLVFAKEAGAKENGVLCGRATWAGSVKEYVEKGEAGaRQWL 300 
T LPYI+LSAGV A+LFQETL FA ++GAKFNGVLCGRATWa6SV+ Y+++GE AR+WL 

Sbjct: 241 OmPYIYLSAGVSAKLPQETLQFAHDSGAKENGVLCGRATWAGSVEPYIKEGEKaAREWL 300 

Query: 301 RTIGFQNIDELNKILQKTATSWKER 325 
RT GF+NIDELNK+L KTA+ W ++ 
Sbjct: 301 RTTGFENIDELNKVLVRTASPWTDK 325 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6923> which encodes the amino acid 
sequence <SEQ ID 6924>. Analysis of this protein sequence reveals the following: 

Possible site: 26 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.0600 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

An alignment of the GAS and GBS proteins is shown below. 

Identities = 230/323 (71%), Positives = 276/323 (85%), Gaps = 1/323 (0%) 

Query: 3 LTEQKQKHMEQLSDKirailSftlAFDQRGALKRIJlAiaQSEEPWSQIEALKVLVAEELTP 62 

LTE K+K ME+LS +G+ISaLAFDQRGALKR+MA++Q++EPTV QIE LK LV+EELTP 
Sbjct: 5 LTENKRKSMEKLS-VDGVISALAFDQRGRLKBMMAQHQTKEPTVEQIEELKSLVSEELTP 63 

Query: 63 YASSMLLDPEYGLPATKVLDDNAGLLLAYEKTGYDTSSTKRLPDCLDIWSAKRIKEEGAD 122 
+ASS+LLDPEYGLPA++V + AGLLLAYEKTGYD ++T RLPDCLD+WSAKRIKE GA+ 

Sbjct: 64 FASSILLDPEYGLPASRVRSEEAGLLLAYEKTGYDATTTSRLPDCLDVWSAKRIKEftGAE 123 

Query: 123 AVKFLLYYDVDSSDEVHEEKEAYIERIGSECVAEDIPFFLEILSYDEKITDSSGIEYAKI 182 
AVKPLLYYD+D +VNE+K+AYIERIGSEC AEDIPF+LEIL+YDEKI D++ E+AK+ 
Sbjct: 124 AVKFLLYYDIDGDQDVNEQKKAYIERIGSECRAEDIPFYLEILTYDEKIADNASPEFAKV 183 

Query: 183 KPRKVIEAMKVFSNPRENinVLKVEVPVNMDYVEGFAQGETAYNKATAAAYFREQDQATL 242 

K KV EAMKVFS RF +DVLKVEVPVNM +VEGFA GE + K AA FR+Q+ +T 
Sbjct: 184 KAHKVNEAMKVFSKERFGV0VLKVEVPVHMKFVEGPAIX3EVLFTKEEAA^ 243 


Query: 243 LPYIFLSAGVPAQLFQETLWAKEAGAKFNGVLCGRATWAGSVKEYVERGEAGARQWLRT 302 

LPYI+LSAGV A+LFQ+TLVFA E+GAKFNGVLCGRATWAGSVK Y+E+G AR+WLRT 
Sbjct: 244 LPYIYLSAGVSAKLFQDTLVFAAESGAKENGVLCGRATWAGSVKVYIEEGPQAAREWLRT 303 

Query: 303 IGFQNIDELNKILQKTATSWKER 325 

GF+NIDELNK+L KTA+ W E+ 
Sbjct: 304 EGFKNIDELNKVLDKTASPWTEK 326 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2240 

A DNA sequence (GBSx2359) was identified in S.agalactiae <SEQ ID 6925> which encodes the amino 
acid sequence <SEQ ID 6926>. This protein is predicted to be lacx protein, chromosomal. Analysis of this 
protein sequence reveals the following: 

Possible site: 52 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.0543 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

A related GBS nucleic acid sequence <SEQ ID 10185> which encodes amino acid sequence <SEQ ID 
10186> was also identified. 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAA25184 GB:M60447 ORF [Lactococcus lactis] 
Identities = 173/298 (58%), Positives = 219/298 (73%) 

Query: 24 MAITIQiraELQVTLKALGAmrSITDSQGVEYLWQGDATfWGGQRI'ILFPICGSVRNDCV 83 
M I ++N L V K LG +TSI D G+EYLWQ D YW GQAPILFPICGS+RND 
Sbjct: 1 MTIELKMEYLTVQFKTLGGQLTSIKDKDGLEYLWQADPEYmGQAPILFPICGSLRNDWA 60 

Query: 84 IYRPAQAPHFTGIIPRHGFVRHKTEDYDYISDSSVRFTIKSSKEMLINYPYRFSLEITYT 143 
IYRP + P FTG+I RHGFVR + F + ++++SV F+IK + EML NY Y+F L + YT 
Sbjct: 61 IYRPQERPFFTGLIRRHGFVRKEEFTLEEVNENSVTFSIKPMAEMLDNYLYQFELRWYT 120 

Query: 144 LRNKSIAITYIVKNLESEKNMPYAIGAHPGFNCPLFEKEVFSDYYLEFEQFETCTIPESF 203 
L KSI + V NLE+EK MPY IGAHP FNCPL E E + DY LEF + E+C+IP+SF 
Sbjct: 121 LNGKSIRTEFQVTNLETEKTMPYFIGftHPAFNCPLVEGEKYEDYSLEFSEVESCSIPKSF 180 

Query: 204 PDTGLLDLQARHPFLENQKQLSLNHALFEKDAITLDQLRSKTVYLKSRNHAKGIQLDFDD 263 
P+TGLLDLQ R PFLENQK L Ij+++LF DAITLD+L+S++V L+SR KG+++DFDD 
Sbjct: 181 PETGLLDLQDRTPFLENQKSLDLDYSLFSHDAITLDRLKSRSVTLRSRKSGKGLRVDFDD 240 

Query: 264 FENLILWTSNNGGPFIALBPWSSLSTSIEESDIIiEDKQNIVRUJPKQSKQHSIRITIL 321 
F NLILW++ N PF+ALEPWS LSTS+EE +ILEDK + ++ P + + S ITIL 
Sbjct: 241 FPNLILWSTTNKSPFIALEPWSGLSTSLEEGWILEDKPQVTKVLPLDTSKKSYDITIL 298 

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2241 

A DNA sequence (GBSx2361) was identified in S.agalactiae <SEQ ID 6927> which encodes the amino 
acid sequence <SEQ ID 6928>. This protein is predicted to be ABC transporter. Analysis of this protein 
sequence reveals the following: 

Possible site: 49 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.3272 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

A related GBS nucleic acid sequence <SEQ ID 10183> which encodes amino acid sequence <SEQ ID 
10184> was also identified. 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAA51350 GB:X72832 leucine rich protein [Streptococcus equisimilis] 
Identities = 101/278 (36%), Positives = 160/278 (57%), Gaps = 1/278 (0%) 

Query: 10 MDFKELFPKVITKQEVKQSEDYIIVEQDGHVLHFPKSSLTKRELYLLQMTPSLEDASSVD 69 
M+ K+ PPE+ ++++ V++ +HFPKS Ij+++E LL++ + 
Sbjct: 1 MELKDYFPEMQVGPHPLGDKEWVSVKEGDQYVHFPKSCLSEKERLLLEVGLGQYEVLQ-P 59 

Query: 70 SQNPWYRYLVEGRGRLPQSHSAVQFIFIEHQFTLSEELKDFLSPLVINVETIMTINQTQS 129 
+PW RYL++ +G PQ QFI + + HQ L +L + L ++ +E 1+ 1+ TQ+ 
Sbjct: 60 LGSPWQRYLLDHQGNPPQLFETSQFIYLNHQQVLPADLVELLQQMIAGLEVILPISTTQT 119 

Query: 130 WILNQDNFFNATELLTDILPTIE]S^^E^ITRLRCYFGNSWTHLQAVDWKELYEEEYKLFTL 189 
+ Q L +LPT+E+DF L + GN+W + A +E +EEE +L T 
Sbjct: 120 AFLCRQATSIKVLRSLEGLLPTLESDFGLALTMFViSNaWYQVaftGTr^ 179 

Query: 190 PLSHKAEQHYCRFPKMALWALANQSPMPSIKAKCLQHILDTSDTSAIIKALWQEQGNLAK 249 
+Ii K+ F ++ LW++ + P++ + Q + SD + ++ MiW E GNL + 
Sbjct: 180 YLKQKSGGKLLTFAEVMLWSILSHQSFPACiTRQFHQFmPQSDMADVVHZiLWSEHGNLVQ 239 

Query: 250 TAKALFIHRNSLQYKLDKFTQSSGLNLKILDDLAYAYL 287 
TA+ L+IHRNSLQYKLDKF Q SGL+LK LDDLA+AYL 
Sbjct: 240 TAQRLYIHKNSLQYKLDKFAQQSGLHLKQLDDLAFAYL 277 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6929> which encodes the amino acid 
sequence <SEQ ID 6930>. Analysis of this protein sequence reveals the following: 

Possible site: 14 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4332 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

An alignment of the GAS and GBS proteins is shown below. 

Identities = 106/287 (36%), Positives = 169/287 (57%), Gaps = 4/287 (1%) 

Query: 3 KTWED-AMDFKELFPEVITKQEVKQSEDYIIVEQDGHVLHFPKSSLTKRELYLLQM-TP 60 
KTV++ AM+ K+ FPE+ +D++ +++ +HFPKS L+++E LL++ 
Sbjct: 7 KTVMKGMftMELKDYFPEMQVGPHPLGDJCDWMSIKEGDQYVHFPKSCLSEKERLLLEVGLG 66 

Query: 61 SLEnASSVDSQNPWYRYLVEGRGRLPQSHSAVQFIFIEHQFTLSEELKDFLSPLVINVET 120 
E + S PW RYL++ +G PQ + QFI++ HQ L ++L + L ++ +E 
Sbjct: 67 QCEVLQPLGS--PWQRYLLDHQGNPPQLYETSQFIYIJSIHQQALPDDLVELLQQMIAGLEV 124 

Query: 121 IMTINQTQSVMILNQDNFFNATELLTDILPTIENDFNTRLRCYFGNSWTHLQAVDWKELY 180 
1+ 1+ TQ+ + Q L D+LPT+E+DF L + GN+W + A +E + 
Sbjct: 125 ILPISATQTAFLCRQAISIKVLRWLEDLLPTLESDFGLALTMFVGNAWYQVaaGTLRECF 184 

Query: 181 EEEYKLFTLFLSHKAEQHYCRPPKMALWALANQSPMPSIKAKCLQHILDTSDTSAIIKAL 240 
EEE +L T +L ++ + F + LW+L + ++ + Q + SD + ++ AL 
Sbjct: 185 EEECQLLTAYLRQQSGRKLLTFSGLMLWSLLSHHTFLALTRQFHQFLSPQSDMADWHAL 244 

Query: 241 WQEQGinj^AKALFIHRNSLQYKLDKFTQSSGIiNLKILDDIAYAYL 287 
W E GNL +TA+ L+IHRNSLQYKLDKF Q SGL+LK LDDLA+A+L 
Sbjct: 245 WSEHGNLVQTAQRLYIHRNSLQYKLDKFAQQSGLHLRQLDDLAFAHL 291 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2242 

A DNA sequence (GBSx2362) was identified in S.agalactiae <SEQ ID 6931> which encodes the amino 
acid sequence <SEQ ID 6932>. This protein is predicted to be multiple sugar-binding transport ATP-binding 
protein msmk (malK). Analysis of this protein sequence reveals the following: 

Possible site: 17 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4392 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAA26938 GB:M77351 ATP-binding protein [Streptococcus mutans] 
Identities = 320/377 (84%), Positives = 359/377 (94%) 

Query: 1 MVKimiNHIYKKyPSASHySVEDFDLDIKDKEFIVFVGPSGCGKSTTLRMIAGLEDISEG 60 

MVELNLNHIYKKYP++SHYSVEDFDLDIK+KEFIVFVGPSGCGKSTTLRM+AGLEDI++G 

Sbjct: 1 MVELNLNHIYKKYPNSSHYSVEDFDLDIKNKEFIVFVGPSGCGKSTTLRMVAGLEDITKG 60 

Query: 61 ELKIDGEWNDKSPKDRDIAMVFQNYALYPHMTVYDNMAFGLKLRKFSKQEIDKRVREAA 120 
ELKIDGEWNDK+PKDRDIAMVFQNYALYPHM+VYDNMAFGLKLR +SK+ IDKRV+EAA 
Sbjct: 61 ELKIDGEVVITOKAPKDRDIAMVFQNYALYPHMSVYraSIMAFGLKLRHySKEAIDKRVKEAA 120 

Query: 121 ANIGLTEFLERKPADLSGGQRQRVAMGRAIVRDAKVFLMDEPLSNLDAKLRVSMRAEIAK 180 

+GLTEFLERKPADLSGGQRQRVAMGRAIVRDAKVFLMDEPLSNLnAKr.RVSMRAEIAK 
Sbjct: 121 QILGLTEFLERKPADLSGGQRQRVAMGRAIVRDAKVFLMDEPLSNLDAKLRVSMRAEIAK 180 


Query: 181 IHQRIGSTTIYVTHDQTEAMTLADRIVIMSATKNPDGDGTIGKIEQVGSPQELYNLPANK 240 

IH+RIG+TTIYVTHDQTEaMTLADRIVIMS+TKN DG GTIG++EQVG+PQEriYN PANK 
Sbjct: 181 IIHRRIGATTIYVTHDQTEAMTLADRIVIMSSTKNEDGSGTIGRVEQVGTPQELYNRPANK 240 

Query: 241 FVAGFIGSPSMNFFKVKVENGMIISEDGLRIAIPEGQEKLLESRGYRGKELIFGIRPEDI 300 

FVAGFIGSP+MNFF V +++G ++S+DGL IA+ EGQ K+LES+G+K K LIFGIRPEDI 
Sbjct: 241 FVAGFIGSPAMNFFDVTIKDGHLVSKDGLTIAVTEGQLKMLESKGFKNKNLIFGIRPEDI 300 

Query: 301 SSNLLVQDTYPNANVEAEVLVSELLGSETMLYVKLGQTEFASRVEARDFHNPGEKVNLTF 360 
SS+LLVQ+TYP+A V+AEV+VSELLGSETMLY+KLGQTEFA+RV+ARDFH PGEKV+LTF 

Sbjct: 301 SSSLLVQETYPDATVDREVVVSELLGSETMLYIiKLGQTEFAARVDARDFHEPGEKVSLTF 360 

Query: 361 NVAKGHFFDADTEQAIR 377 
NVAKGHFFDA+TE AIR 
Sbjct: 361 NVftlOSHFFDAETEAAIR 377 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6933> which encodes the amino acid 
sequence <SEQ ID 6934>. Analysis of this protein sequence reveals the following: 

Possible site: 48 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4642 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 



An alignment of the GAS and GBS proteins is shown below. 

Identities = 332/377 (88%), Positives = 359/377 (95%) 

Query: 1 MVELNLNHIYKKYPSASHYSVEDFDLDIKDKEFIVFVGPSGCGKSTTLRMIAGLEDISEG 60 
MVEIiNUslHIYKKyP+ +HY+VEDFDI1DIKDKEFIVFVGPSGCGKSTTLRMIAGLEDISEG 
Sbjct: 1 ^l^rt;UNrl:iNHIYKK3•POT'THYAVEDroLDIKDKEFIVFVGPSGCG 60 

Query: 61 ELKIDGEVVNDKSPKDRDIAIWFQNYALYPHMTVYDNMAFGLKI^KI'SKQEIDK^ 120 
ELKI GEVVNDKSPKDRDIAMVFQNYALYPHMTVYDNMAFGLKLRK+ K +ID+RV+EAA 
Sbjct: 61 ELKIGGEVVNDKSPKDRDIAMVFQNYALYPHMTVYDNMAFGLKLRKYKKDDIDRRVKEAA 120 

Query: 121 ANIGLTEFLERKPADLSGGQRQRVAMGRAIVRDAKVFLMDEPLSNLDAKLRVSMRAEIAK 180 
+GLTEFLERKPADLSGGQRQRVAMGRAIVRDAKVFLMDEPLSNLDAKLRVSMRAEIAK 
Sbjct: 121 QILGLTEFLERKPADLSGGQRQRVAMGRAIVRDAKOTLNffiEPLSNLDAKLRVSMRAEIAK 180 

Query: 181 IHQRIGSTTIYVTHDQTEaMrLADRIVIMSATKOTDGDGTIGKIEQVGSPQELYNLPAN^ 240 
IH+RIGSTTIYVTHDQTEAiMTIiAnRIVIMSaTKNP G+GTIGKIEQVGSPQELYNLPANK 
Sbjct: 181 IHRRIGSTTIYVTHDQTEROTI^RIVIMSATKNPQGNGTIGKIEQVGSPQELYNLPANK 240 

Query: 241 FVAGFIGSPSMNFFKVKVENGMIISEDGLRIAIPEGQEKLLESRGYKGKELIFGIRPEDI 300 
FVAGFIGSP+MNFF+V+V++G I+SEDGL lAIPEGQ K+I1E+ GYKG+++ FGIRPEDI 
Sbjct: 241 FVRGFIGSPAMNFFEVEVia3GRIVSEDGII)IAIPEGQaKMIiEaaGYKGEKVTFGII^ 300 

Query: 301 SSrn^LVQDTYENftNVEREVLVSELLGSETMLYVKLGQTEFASRVE^ 360 
SS +V DTYP+A V AEVLVSELLGSETMLYVKLGQTEFASRV+ARDFH+PGE+V+LTF 
Sbjct: 301 SSRQIVHDTYPSATVTAEVLVSELLGSETMLYVKLGQTEFASRVDARDFHSPGEQVSLTF 360 

Query: 361 NVAKGHFFDADTEQAIR 377 
NVAKGHPFD DTEQAIR 
Sbjct: 361 NVARGHPFDRDTEQAIR 377 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2243 

A DNA sequence (GBSx2363) was identified in S.agalactiae <SEQ ID 6935> which encodes the amino 
acid sequence <SEQ ID 6936>. This protein is predicted to be glucan 1,6-alpha-glucosidase (dexB) (treC). 
Analysis of this protein sequence reveals the following: 

Possible site: 56 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2525 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAA51348 GB:X72832 glucan 1,6-alpha-glucosidase [Streptococcus equisimilis] 
Identities = 413/535 (77%), Positives = 476/535 (88%), Gaps = 1/535 (0%) 

Query: 1 MKKHWWHKATIYQIYPRSFMDSDGDGVGDIKGITSKLDYLEKLGITAIWLSPVYQSPMDD 60 

M+K WWHKATIYQIYPRSF D+ G+G+GD+RGITS+LDYL+KLGITAIWLSPVYQSPMDD 
Sbjct: 1 MQKQWWHKATIYQIYPRSFKDTSGNGIGDLKGITSQLDYLQKLGITAIWLSPVYQSPMDD 60 

Query: 61 NGYDISDYQAIADIFGDMNDMDQLLQEANQRGIKIIMDLVVNHTSDEHAWFVEARENPNS 120 

NGYDISDY+AIA++FG+M+DMD LL AN+RGIKIIMDLWNHTSDEHAWFVEARENPNS 
Sbjct: 61 NGYDISDYEAIAEVFGNMDDNmDmiAANERGIKIimLVVNHTSDEiaWFVEAREN™ 120 

Query: 121 PERDFYIWRDEPNDLTSIFSGSAWEYDKVSGQYYLHLFSKRQPDIiNWENEALRHKIYDNIM 180 

PERD+YIWRDEPN+L SIFSGSAWE D+ SGQYYLHLPSK+QPDIiNWEN +R KIYDMM 
Sbjct: 121 PERDYYITCRDEENNLMSlFSGSAVmDEASGQYYimPSKKQPDIjNWE]^^ 180 



Query: 181 NFWIDKGIGGFRMDVIDLIGKIPDKGITGNGPKLHDYLKEMNRASFGKHDLLTVGETWGA 240 
NFWI KGIGGFRMDVIDLIGKIPD ITGNGP+LHDYLKEMN+A+FG HD++TVGETWGA 
Sbjct: 181 NFWIAKGIGGFRMDVIDLIGKIPDSEITGNGPRLHDYLKEMNQATFGNHDVMWGETW 240 

Query: 241 TPDIAKQYSNPDKmELSMVFQPEHVGLQHKPiaPKWDYSDGLDVPALKDIFTKWQTQLEL 300 
TP+IA+QYS P+N+ELSMVFQFEHVGLQHKP+APKWDY++ LDVPALK IF+KWQT+L+L 

Sbjct: 241 TPEIARQYSRPENKELiSMWQPEHVGLQHKENAPKWDYMELDVPALKTIFSKWQTELKL 300 

Query: 301 GQGWNSLFWNNHDLPRVLSIWGNDSDNRKQSAKALAILLHLMRGTPYIYQGEEIGMTNYP 360 
G+GWNSLFWNNHDLPRVLSIWGNDS R++SAKALAII1LHLMRGTPYIYQGEEIGMTNYP 
Sbjct: 301 GEGWNSLFWNNHDLPRVLSIWGNDSIYREKSAKALAILLHLMRGTPYIYQGEEIGMTNYP 360 

Query: 361 FECLADVDDIESLNYAKEAMDNGVSEATILDSIRKVGRDNARTPMQWSQEHQAGFTKH 419 

P+ L +VDDIESIJSIYAKEaM+HGV A ++ SIRKVGRraSIARTPMQWS++ AGF++ 
Sbjct: 361 FKDLTEVDDIESLOTAKEAMENGVPAARVMSSIRKVGRDNaRTPMQWSKDIBAGFSEAQE 420 


Query: 420 PWLAWPNYQEINVEAAm3TESIFYTYQKLVALRKEHDWLVDADFKLLETADKVFAYVR 479 

WL VNPNYQEINV AL + +SIFYTYQ+L+ALRK+ DWLV+AD+ LL TADKVFAY R 
Sbjct: 421 TWLPVNENYQEINVanALANQDSIFYOTQQLIALRKDQDWLVEADra^ 480 

Query: 480 QTDKERYLIVAlSmSDQNQSFEFPEAVKETIISOTEVQEVLSSNTLKPWDAFCIEL 534 

Q +E Y+IV N+SDQ Q F A E +I+NT+V +VL + L+PWDAFC++L 
Sbjct: 481 QFGEETYVIVVNVSDQEQVFAKDLAGSffiWITISrrDVDKVLETKHLQPWDAFCV^ 535 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6937> which encodes the amino acid 
sequence <SEQ ID 6938>. Analysis of this protein sequence reveals the following: 

Possible site: 56 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2793 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

An alignment of the GAS and GBS proteins is shown below. 

Identities = 418/535 (78%), Positives = 474/535 (88%), Gaps = 1/535 (0%) 

Query: 1 MKKHWWHKATIYQIYPRSFMDSDGDGVGDIKGITSKLDYLEKLGITAIWLSPVYQSPMDD 60 

M HWWHKATIYQIYPRSF D+ G+G+GD+KGITS+LDYL+KLGITAIWLSPVYQSPMDD 
Sbjct: 1 MNNHV^KATIYQIYPRSFKDTSGNGIGDLKGITSQLDYLQKLGITAIWLSPVYQSPMDD 60 


Query: 61 NGYDISDYQAIADIFGDMNDMDQLLQEANQRGIKIIMDLVVNHTSDEHAWFVEARENPNS 120 

NGYDISDY+AIAD+FGDM DMD+LL AN+R6IKIIMDLWNHTSDEHAWFVEftRENENS 
Sbjct: 61 NGYDISDYEAIADVFGDMADMDELLAAANERGIKIIMDLWNHTSDEHAWFVEARENENS 120 

Query: 121 PERDFYIWRDEPNDLTSIFSGSAWEYDKVSGQYYLHLFSKRQPDLNWENEALRHKIYDMM 180 

PERD+YIWRDEPN+L SIFSGSAWE D+ SGQYYLHLFSK+QPDUSIWEN LR KIYDMM 
Sbjct: 121 PERDYYIWRDEPNNLMSIFSGSAlffil^EASGQYYimFSKICQPDIJJWENAQLRQKIYDMM 180 

Query: 181 NFWIDKGIGGFR^CWIDLIGKIPDKGITGNGPKI■HDYLKEMIS1RASFGKHDLLTVGETWGA 240 

NFWI KGIGGFRMDVIDLIGK+PD ITGNGP+LHDYLKEMN+A+FG HD++TVGETWGA 

Sbjct: 181 NFWIAKGIGGFRMDVIDLIGKVPDLEITGNGPRLHDYLKEMNQATFGNHDVMTVGETWGA 240 

Query: 241 TPDIAKQYSNPDNEELSMVFQFEHVGLQHKPDAPKWDYSDGLDVPALKDIFTKWQTQLEL 300 
TP+IA+QYS P+N+ELSMVFQFEHVGLQHKPnAPKWDY+ LDVPALK IF+KWQT+L+L 
Sbjct: 241 TPEIARQYSRPENKELSMVFQFEHVGLQHKPDAPKWDYAKELDVPALKAIFSKWQTELKL 300 

Query: 301 GQGmSLFWNNHDLPRVLSIWGNDSDNRKQSAKALAILLHLMRGTPYIYQGEEIGMTNYP 360 

G+GWNSLFWNNHDLPRVLSIWGNDS R++SAKALAILLHLMRGTPY1YQGEEIGMTNYP 
Sbjct: 301 GEGWNSLFWMsraDLPRVLSIWGNDSTYREKSAKALAILLHIMRGTPYIYQGEEIGMTNYP 360 


Query: 361 FECLRDVDDIESLNYAKEAMDNGVSEATILDSIRKVGRDNARTPMQWSQEHQAGFTKG-T 419 

F+ L +V+DIESLNYAKEAM NGVS A ++DSIRKVGRDNARTPMQWS++ AGF++ 
Sbjct: 361 FKDLTEV^roIESIJIYAKEaMGNGVSftARV^lDSIRKVGRDNARTPMQWSKDTHAGFSEAKE 420 

Query: 420 PWLAVNPNYQEINVEAALNDTESIFYTYQKLVALRKEHDWLVDADFKLLETADKVFAYVR 479 
WL VNPNYQ+INV AL D +SIFYTYQKL+M)RKE DWLV+AD+ LL TADKVFAY R 
Sbjct: 421 TWLPVNPNyQDINVOTMiaDPDSIFYTYQKLIJU^RKEQDWLVEADy^ 480 

Query: 480 QTDKERYLIVANLSDQNQSFEFPEAVKETIISNTEVQEVLSSNTLKPWDAFCIEL 534 
Q +E Y+IV N+SD+ Q F A + II+NT+V VL + L+PWDAFC++L 

Sbjct: 481 QLGEETOVrVTOVSDEEQVB7OT)LAGAC3irriAHTDVDT^ 535 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2244 

A DNA sequence (GBSx2364) was identified in S.agalactiae <SEQ ID 6939> which encodes the amino 
acid sequence <SEQ ID 6940>. Analysis of this protein sequence reveals the following: 

Possible site: 44 

>>> Seems to have an uncleavable N-term signal seq 

Final Results 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAB49738 GB:D21942 UDP-galactose 4-epimerase [Streptococcus mutans] 
Identities = 267/331 (80%), Positives = 306/331 (91%) 

Query: 1 MAVLILGGAGYIGSHMVDQLITQGKEKVIWDNLVTGHRQAVHSDAIFYEGDLSDKTFMR 60 

MA+L+LGGAGYIGSHMVD+LI +G+E+V+WD+LVTGHR AVH A FY+GDL+D+ FM 
Sbjct: 1 MAILVLGGAGYIGSHMVDRLIEKGEEEWWDSLVTGHRAAVHPAAKFYQGDIiADREFMS 60 

Query: 61 QVFRENPDVDAVIHFAAFSLVaESMENPLKYFDIOT'AGMIKLLEVMSIECIjlKNIV^^ 120 
VFRENPDVDAVIHFAA+SLVaESM+ PLKYFnNNTAGMIKLLEVM+E +K IVFSSTA 

Sbjct: 61 MVFRENPDVmVIHFAAYSLVAESMKKPLKYFDim'AGMIKLIjEVMSEPGVKyiVFSSTA 120 

Query: 121 ATYGIPEQVPILETAPQNPINPYGESKLMMETIMKWADQAYGIKFVALRYFNVAGDKPDG 180 
ATYGIP ++PI ET PQ PINPYGESKLMMETIMKW+D+AYGIKFV +RYENVAG KPDG 
Sbjct: 121 ATYGIENEIPIKETTPQRPINPYGESKLMMETIMKWSDRAYGIKFVPVRYFNVAGAKPDG 180 

Query: 181 SIGEDHKPETHLLPIII^KftQGVRDKIMIFGDDYNTPIXSTNVRDYVHPFDLaDAHILAV^ 240 

SXGEDH PETtmLPIILQVAQGVR+KIMIFGDDYOTPDGTNVRDYVHPFDLAD H+LA++ 
Sbjct: 181 SIGEDHSPETHIiPIILQVAQGVREKIMIFGDDYNTPDGTNVRDYVHPFDriADRHLIiAIJI 240 


Query: 241 YLRQGNESNVFNLGSSTGFSNLQMLEAARRITGKEIPAQKAARRPGDPDTLIASSEKARQ 300 

YLRQGN S FNLGSSTGFSNLQ+LEAAR++TG++IPA+KAARR GDPDTLIASSEKAR+ 
Sbjct: 241 YLRQGNPSTAETILGSSTGFSNIiQILEAARKVTGQKIPAEKaARRSGDPDTLIASSEKARE 300 

Query: 301 ILGWEPKFnNIDKIISSAWAWHSSHENGYED 331 

++GW+P+FD+I+KII+SAWAWHSSHP GY+D 
Sbjct: 301 WGWKPQFDDIEKIIASAWAWHSSHPRGYDD 331 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2245 

A DNA sequence (GBSx2366) was identified in S.agalactiae <SEQ ID 6941> which encodes the amino 
acid sequence <SEQ ID 6942>. This protein is predicted to be two-component response regulator. Analysis 
of this protein sequence reveals the following: 

Possible site: 40 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.3945 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:BAB06470 GB:AP001516 two-component response regulator [Bacillus halodurans] 
Identities = 71/223 (31%), Positives = 139/223 (61%), Gaps = 7/223 (3%) 

Query: 3 VLIIEDDPMVEFIHKiraiEKIJQYFQNIYSTASQTQAIAYIJSIDIKIQLVLLDI 62 

VL+IEDDPMV+ ++R ++EKL+ F + +TA+ + + +++ L+LLDI + + +GL 
Sbjct: 9 VLLIEDDPMVQEVNRMFVEKLSGFTIVGTTATGEEGMVKTRELQPDLILLDIEMPKQDGL 68 

Query: 63 ELLKLLRNQHQNTEVIVISAANEAHTVKEAFHLGIVDYLIKPFTFERFESSIEKFLNHYH 122 

+K +R Q+ + ++I ++AAN+ T+K G++DYL+KPFTFER ++++ 4+ + 

Sbjct: 69 SFIKQIREQYIDVDIIAVTAAITOTKTIKTLLRYGVMDYLVKPFTFERLKAALTQYEEMFR 128 

Query: 123 TFEaD-KIYQDNIDHFQKIDSGWLEGEVKLDE--KiGLSEITYQHILnAIQELEQPFTIQE 179 
+ + ++ QD++D K + + +D+ KGL T Q +++ ++EL++P + +E 

Sbjct: 129 KMQKEAELSQDSLDEMIK QKQAQftNMDDLPKGLHfiHTLQQVIERLEELDEPKSAEE 184 

Query: 180 LAKCSQFSHVSVRKYIAYMEEKGLLTSQQIYTKVGRPYKVYKL 222 
+ + + V+VR+Y+ Y+E G + Y +GRP + YKL 

Sbjct: 185 IGRDVGLftRVTVRRYLNYLESVGQVEMDLTYGSIGRPIQTYKL 227 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6943> which encodes the amino acid 
sequence <SEQ ID 6944>. Analysis of this protein sequence reveals the following: 

Possible site: 37 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4053 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

An alignment of the GAS and GBS proteins is shown below. 

Identities = 123/220 (55%), Positives = 156/220 (70%) 

Query: 1 mVLIIEDDPMVEFIHRNYLEKUWFQNIYSTASQTQAIAYLSTOIKIQIiVLLDlHIKEGN 60 

M+VLIIEDDPMV+FIHRNYLEKIiN FIS+S +LDI L+LLDIHI +GN 

Sbjct: 1 MNVLIIEDDPMVDFIHRNYLEKLNLFDRIISSDSMKAVQSILTDYAIDLILLDIHITDGN 60 

Query: 61 GLELLKLLRNQHQNTEVIVISAftNEAHTVKEAFHLGIVDYLIKPFTFERFESSIEKFLNH 120 
G++ L+ R QH EVI+ISRflN+ + +++ PHLGI+DYLIKPFTFERF+ SI++F+ H 

Sbjct: 61 GIQFLEKWRTQHIPCEVIIISAANDGNIIRDGEHLGIIDYLIKPFTFERFQESIQQFVTH 120 

Query: 121 YHTFEADKIYQDNIDHFQKIDSGWLEGEVKLDEKGLSEITYQHILDAIQELEQPFTIQEL 180 

++ Q ID + + S +L EKGLSE T+Q I++ 1+ +QPFTIQEL 

Sbjct: 121 REHLANQQLEQAQIDQLKCLTSKKDTKNKQLLEKGLSESTFQWIMENIKVFDQPFTIQEL 180 

Query: 181 AKCSQFSHVSVRKYIAYMEEKGLLTSQQIYTKVGRPYKVY 220 

A SHVSVRKYIAY+EE L SQQI+TKVGRPY+VY 
Sbjct: 181 ASACHLSHVSVRKYIAYLEENKQLNSQQIFTKVGRPYRVY 220 


Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2246 

A DNA sequence (GBSx2367) was identified in S.agalactiae <SEQ ID 6945> which encodes the amino 
acid sequence <SEQ ID 6946>. Analysis of this protein sequence reveals the following: 



WO 02/34771 PCT/GB01/04789 




Possible site: 21 
>>> Seems to have an uncleavable N-term signal seq 
INTEGRAL Likelihood = -8.76 Transmembrane 12 - 28 ( 6 - 34) 
INTEGRAL Likelihood = -7.43 Transmembrane 178 - 194 ( 173 - 197) 

Final Results 

bacterial membrane --- Certainty=0.4503 (Affirmative) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 
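
Each "Final Results" table in this section lists three candidate localizations with a certainty value and a verdict. As a rough sketch only, the Python below reads such a table and reports the localization with the highest certainty. The names `parse_final_results` and `most_likely_site` are hypothetical, and picking the maximum is an assumption about how a reader would use the table, not a description of the prediction program itself. The pattern accepts the irregular spacing found in this text (for example `Certainty=0. 4503`) but makes no attempt to repair misread characters.

```python
import re

# site name, optional punctuation, "Certainty=<number>", "(verdict)".
_RESULT = re.compile(
    r"bacterial\s+(membrane|outside|cytoplasm)\W*"
    r"Certainty\s*=\s*([\d.\s]+?)\s*\(\s*([^)]+?)\s*\)"
)


def parse_final_results(text):
    """Map each localization to (certainty, verdict)."""
    results = {}
    for site, value, verdict in _RESULT.findall(text):
        # Drop spaces that split the number, e.g. "0. 4503" -> "0.4503".
        results[site] = (float(re.sub(r"\s+", "", value)), verdict)
    return results


def most_likely_site(results):
    """Return the localization with the highest certainty value."""
    return max(results, key=lambda site: results[site][0])


if __name__ == "__main__":
    table = """
    bacterial membrane . Certainty=0. 4503 (Affirmative) < suco
    bacterial outside Certainty=0. 0000 (Not Clear) < suco
    bacterial cytoplasm Certainty=0. 0000 (Not Clear) < suco
    """
    parsed = parse_final_results(table)
    print(most_likely_site(parsed), parsed)
```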

A related GBS nucleic acid sequence <SEQ ID 9003> which encodes amino acid sequence <SEQ ID 9004> 
was also identified. Analysis of this protein sequence reveals the following: 

Lipop: Possible site: -1 Crend: 3 
SRCFLG: 0 
McG: Length of UR: 27 
Peak Value of UR: 2.99 
Net Charge of CR: 3 
McG: Discrim Score: 12.92 
GvH: Signal Score (-7.5): -2.57 
Possible site: 19 
>>> Seems to have an uncleavable N-term signal seq 
Amino Acid Composition: calculated from 1 
ALOM program count: 2 value: -8.76 threshold: 0.0 
INTEGRAL Likelihood = -8.76 Transmembrane 10 - 26 ( 4 - 32) 
INTEGRAL Likelihood = -7.43 Transmembrane 176 - 192 ( 171 - 195) 
PERIPHERAL Likelihood = 3.18 149 
modified ALOM score: 2.25 
icml HYPID: 7 CFP: 0.450 

*** Reasoning Step: 3 

Final Results 

bacterial membrane --- Certainty=0.4503 (Affirmative) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 
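
The INTEGRAL lines above each give a likelihood score and two residue ranges for a predicted transmembrane segment. The sketch below, offered under stated assumptions rather than as the program's own logic, parses such lines into records. `Segment`, `parse_integral_lines` and the field names are hypothetical; in particular, treating the parenthesised pair as an extended range around the first pair is an interpretation, not something the text states.

```python
import re
from typing import NamedTuple


class Segment(NamedTuple):
    likelihood: float
    start: int       # first residue of the reported segment
    end: int         # last residue of the reported segment
    ext_start: int   # first number inside the parentheses (assumed extended range)
    ext_end: int     # second number inside the parentheses


_INTEGRAL = re.compile(
    r"INTEGRAL\s+Likelihood\s*=\s*(-?\d+\.\d+)\s+Transmembrane\s+"
    r"(\d+)\s*-\s*(\d+)\s*\(\s*(\d+)\s*-\s*(\d+)\s*\)"
)


def parse_integral_lines(text):
    """Return the parsed segments ordered by start position."""
    segments = [
        Segment(float(score), int(a), int(b), int(c), int(d))
        for score, a, b, c, d in _INTEGRAL.findall(text)
    ]
    return sorted(segments, key=lambda seg: seg.start)


if __name__ == "__main__":
    block = """
    INTEGRAL Likelihood = -8.76 Transmembrane 10 - 26 ( 4-32)
    INTEGRAL Likelihood = -7.43 Transmembrane 176 - 192 ( 171 - 195)
    """
    for seg in parse_integral_lines(block):
        # Segment length in residues, counting both end points.
        print(seg, seg.end - seg.start + 1)
```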

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB15141 GB:Z99120 similar to two-component sensor histidine 
kinase [YufM] [Bacillus subtilis] 
Identities = 132/461 (28%) , Positives = 245/461 (52%) , Gaps = 7/461 (1%) 

Query: 3 MKKKLSLWAPLSLILVTMTICIFSIFYYVTIHQSYRMVRVQEEKILKNTGyALSRNPQVI 62 

MKK L L L++ + + + I ++ Q+ + +R QE+ T ++ P 

Sbjct: 1 MKKTLKLQTRLTIFVCIWLIALLITFFTVGAQTTKRIRDQEKATALQTAEMVAEAPMTA 60 



Query: 63 QTLKDNHYDQSLQKQMLFLSKKSNLDYIVLINLKGIRFTHPDSTKIGKPFQGGDEQAVFK 122 

L+ + LQ + K + +++V++++ GIR THPD +KIGK F+GGDE V K 

Sbjct: 61 AALESGKKQKELQSYTKRVQKITGTEFVVVMDMNGIRKTHPDPSKIGKKFRGGDESEVLK 120 



Query: 123 GKAIMSTAEGSLGKSLRYLIPVY-DHQKQVGAIAVGLKLTTLGDLSQSSIKEFSKPLLIS 181 

G +STA G+LGKS R +PVY ++ KQVGA+AVG+ + + ++ S++ + +S 

Sbjct: 121 GHVHISTASGTLGKSQRAFVPVYAENGKQVGAVAVGITVNEIDEVISHSLRPLYFIICVS 180 

Query: 182 ILISLWTSIISYGLKKQLHNLHPSDIFQHLEERNATLDQIQAAVFVIDQRHIIKRNNPA 241 
 I + ++ I++ +K ++ L P +1 LEER+A L+ + + +D+ IK N 

Sbjct: 181 IFVGVIGAVIVARTVKNIMYGLEPYEIATLLEERSAMLESTKEGILAVDEHGKIKLANAE 240 

Query: 242 ASLLFKKEGQRDLFSGKLLESLIP- -QLKQDHFSKK- -TEQVLHFQGQDYLLSISPITVK 297 

A LF K G + ++ ++P +LK+ +KK ++ + G + + + PI +K 

Sbjct: 241 AKRLFVKMGINTNPIDQDVDDILPKSRLKKVIETKKPLQDRDVRINGLELVFNEVPIQLK 300 

Query: 298 TQmGYVVFLRNVTETLFTIlDQIlAHTTAYASALQAQTHQF^4NQLHVIYGLADIEYYDELK 357 

Q G + R+ TE +QL+ YA+AL+AQ+H+FMN+LHVI GL ++ YD+L 

Sbjct: 301 GQTVGAIATFRDKTEVKHLAEQLSGVKMYANALRAQSHEFMNKLHVILGLVQLKEYDDLG 360 
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Query: 358 lYLKELLEPQI^FLARLSMriTOEPRLASFIIGEREKFAEKHINLSTEIIiVEIPTKSTVED 417 

Y+K++ Q + + V+ LA F++G++ E+ NL E IP + 
Sbjct: 361 DYIKDIAIQQKSETSEIINDVKSSVLAGFLLGKQSFIREQGANLDIBCNGVIPNAaDPSV 420 

Query: 418 VNNYL-LIiHRYINTKILTLLN-STTLVS,LRIj1WQNNLIETD 456 

++ + ++ IN + + + +++ + + N++++ + 

Sbjct: 421 IHELITIIGNLINNGLDAVADMPKKQITMSMRFHNSILDIE 461 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6947> which encodes the amino acid 
sequence <SEQ ID 6948>. Analysis of this protein sequence reveals the following: 

Possible site: 57 
>>> Seems to have a cleavable N-term signal seq. 
INTEGRAL Likelihood =-10.03 Transmembrane 174 - 190 ( 170 - 195) 

Final Results 

bacterial membrane --- Certainty=0.5012 (Affirmative) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

An alignment of the GAS and GBS proteins is shown below. 

Identities = 236/488 (48%) , Positives = 337/488 (68%) , Gaps = 3/488 (0%) 



Query: 3 MKKKLSLWAFLSLILVTMTICIFSIFYYOTIHQSYRMVRVQEEKILKNTGYALSRNPQVI 62 
 MPfl?* T. TiQT.TTiVxM j. QJ-TTV J- xW xxx XX HP xTi XTR TiX X X 
Sbjct: 1 MKKPLRLWftSLSLILVSMIvVTTSLFyGIMLHDTHQSIKNQETHLLTo 60 

Query: 63 QTLKDNHYDQSLQKQMLFLSKKSNLDYIVLINLKGIRFTHPDSTKIGKPFQGGDEQAVFK 122 
 + L +N + ++ NLDY+V++N+KGIR THP+ IGKPFQGGDE+AV 
Sbjct: 61 ELLLNNQPNRKTTAYTNSIASIYNLDYVVVMNMKBIRLTHPNPKNIGKPFQGGDEEAV^ 120 

Query: 123 GKAIMSTAEGSLGKSLRYLIPVYDHQKQVGAIAVGLKLTTLGDLSQSSIKEFSKPLLISI 182 
 GK ++STA+G+LGKSLRYL+PV+D KQ+GAIAVG+KLTTL D++ +S + ++ LL+ + 
Sbjct: 121 GKKVISTAKGTLGKSLRYLVPVFDGDRQIGAIAVGIKLTTLNDVALTSKRNYTLSLLLCL 180 

Query: 183 LISLVVTSIISYGLKKQLHNLHPSDIFQHLEERNATLDQIQAAVFVIDQRHIIKRNNPAA 242 
 LISL+VTS IS+ LK+QLH L PS+I+Q EERNA LDQI+AAVFV+D+ 1++ N A 
Sbjct: 181 LISLLVTSFISFRLKRQLHQLEPSEIYQLFEERHAMLDQIEAAVFWDKAGILQLCNQAG 240 

Query: 243 SLLFKKEGQRDLFSGKLLESLIPQLKQDHFSKKTEQVLHFQGQDYLLSISPITVKTQNRG 302 
 L ++ Q +G L P + + EQ+ + +DYLL+ISPI VK +RG 
Sbjct: 241 QKLIARKCQLGKPTGNSENYLFPDFPKLSLQEGHEQLFRYGEEDYLIAISPICVKNDHRG 300 

Query: 303 YWFLRNVTETLFTLDQLAHTTAYASALQAQTHQFMNQLHVIYGLADIEYYDELKIYLKE 362 
 +++F+R + + TLDQLA+TTAYASALQAQTH+FMNQLHVIYGL DI YYD+LKIYL 
Sbjct: 301 HIIFMREAVKAIDTLDQLAYTTAYASALQaQTHKEMNQLHVIYGLVDIAYYDQLKIYLDS 360 

Query: 363 LLEPQNEFLARLSMLVREPRLASFIIGEREKFAEKHINLSTEILVEIPTKSTVEDVNNYL 422 
 +LEP+NE L LS+LV+EP LASF+IGE+EK+ E +++L ++L EIP +T +NN L 
Sbjct: 361 ILEPENEILTSLSVLVKEPLLASFLIGEQEKYQELNVHLKIDVLSEIPHSATKNQLNNGL 420 

Query: 423 LLHRYINTKILTLLNSTTLVSLRimQNNLIETDYQWENEKWL-LNDYHQYFNDAYFQQL 481 
 +++R+I4.T +LT L +LV + QN+LI + + W+ L F+ YFQQL 
Sbjct: 421 MIYRFIHTNLLTTLRPKSLVLSIQHDQNHLI - -SHYTLTDNWIDLERVQPIFDLPYFQQL 478 

Query: 482 LVDSRATY 489 
 L D+ + + 
Sbjct: 479 LTDTNSQP 486 





SEQ ID 9004 (GBS130d) was expressed in E.coli as a GST-fusion product. SDS-PAGE analysis of total 
cell extract is shown in Figure 123 (lane 8-10; MW 63kDa) and in Figure 184 (lane 4; MW 63kDa). It was 
also expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell extract is shown in Figure 
123 (lane 11; MW 38kDa) and in Figure 181 (lane 7; MW 38kDa). 

GBS130d-GST was purified as shown in Figure 237, lane 11. GBS130d-His was purified as shown in 
Figure 233, lane 9-10. 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2247 

A DNA sequence (GBSx2368) was identified in S.agalactiae <SEQ ID 6949> which encodes the amino 
acid sequence <SEQ ID 6950>. Analysis of this protein sequence reveals the following: 

Possible site: 51 

>>> Seems to have no N-terminal signal sequence 
INTEGRAL Likelihood =-11.52 Transmembrane 364 - 380 ( 353 - 386) 
INTEGRAL Likelihood = -9.6e Transmembrane 33 - 49 ( 26 - 57) 
INTEGRAL Likelihood = -7.80 Transmembrane 87 - 103 ( 82 - 105) 
INTEGRAL Likelihood = -6.85 Transmembrane 153 - 169 ( 144 - 174) 
INTEGRAL Likelihood = -4.41 Transmembrane 301 - 317 ( 300 - 318) 
INTEGRAL Likelihood = -2.81 Transmembrane 216 - 232 ( 212 - 235) 
INTEGRAL Likelihood = -2.39 Transmembrane 120 - 136 ( 120 - 136) 
INTEGRAL Likelihood = -1.65 Transmembrane 57 - 73 ( 56 - 73) 
INTEGRAL Likelihood = -1.17 Transmembrane 428 - 444 ( 428 - 444) 
INTEGRAL Likelihood = -0.32 Transmembrane 276 - 292 ( 276 - 292) 



The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAB18291 GB:U35658 L-malate permease [Streptococcus bovis] 
Identities = 329/428 (76%) , Positives = 375/428 (86%) 

Query: 18 DLKAKLEHIKIGSVPLPVWCLALLILLAGFLQKLPVNMLGGFAVILTMGWFLGTIGASI 77 

D + KL +IGSV LPVY+ A +IL+ L++LPVNMLGGFAVILTMGW LGTIG +1 
Sbjct: 14 DWRNKLTKTRI6SVTLPVYLVTASIILVTALLEQLPVNMLGGFAVILTM6WLLGTIGGNI 73 

Query: 78 PGFKNFGGPAILSLLVPSILVFFNLINKNVLESTNMLMKQANFLYFYIACLVSGSILGMN 137 

P K+FGGPAILSLLVPSI+VFFNL+N+NVL+ST++LMKQANFLYFYIACLV GSILGMN 
Sbjct: 74 PILKHFGGPAILSLLVPSI^OTENLLNQNVLDSTDILMKQANFLYFyIACLVCGSILGMN 133 

Query: 138 RKMLIQGLLRMIFPMLLGMVCAMMVGTPVGVILGLEWRHTLFYIVTPVLAGGIGEGILPL 197 

RK+L+QGL+RMI PM LGM+ AM VGT VG H-LGL W+H+LFYIVTPVLAGGIGEGILPL 
Sbjct: 134 RKILVQGLMRMIVPMALGMIIAMGVGTLVGTLLGLGWKHSLFYIVTPVLAGGI6EGILPL 193 

Query: 198 SLGYSSITGVASEQLVAQLIPATIIGNFFAILCTALLNRLGEKKPHLSGQGQLVRLNKGE 257 

SLGYS+ITG+ SEQLV QLIPATIIGNFFAI+C+ LL+RLGEK+P LSGQGQL+++ + 
Sbjct: 194 SLGYSAITGLPSEQLVGQLIPATIIGNFFAIMCSGLLSRLGEKRPELSGQGQLIKITNSD 253 

Query: 258 DMSDIIADHSGPIDVKKMGGGVLTACSLFIFGHLLQQLTGFPGPVLMIVAAAILKYINVI 317 

D+SD + + PIDVK MG GVL AC+LFI G LLQ LTGFPGPVLMIV AA LKY+NV+ 
Sbjct: 254 DLSDALEEDKAPIDVKLMGAGVLIACTLFITGGLLQHLTGFPGPVLMIWAAFLKYLNW 313 

Query: 318 PRETQNGAKQLYKFISGNFTFPLMAGLGLLYIPLKDWATLSIQYFIWISWFTVISVG 377 

P+ETQ G+KQLYKFISGNFTFPLM GLG+LYIPLKDW LS QYF+WISWFTVI+ G 
Sbjct: 314 PKETQRGSKQLYKFISGNFTFPLMVGLGMLYIPLKDWGMLSWQYFVWISWFTVIATG 373 

Query: 378 FFVSRFLNMNPVEAGIISACQSGMGGTGDVAILSTADRMNLMPFAQVATRLGGAITVITM 437 

FFVSRF+NMNPVEA I+SACQSGMGGTGDVAILSTA+RM LMPFAQVATRLGGAITVITM 
Sbjct: 374 FFVSRPMNJfflJPVEAAIVfiACQSGMGGTGDVAILSTANRMTLMPFAQVATRLGGAITVITM 433 

Query: 438 TAILRMLF 445 

TAI RMLF 
Sbjct: 434 TAIFRMLF 441 
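
In these alignments the middle line of each row marks an identical residue with its letter and a conservative substitution with `+`, which is where the Identities and Positives totals come from. A minimal Python sketch of that counting follows, using the short final row above as input. The function names are hypothetical, and the code assumes an undamaged midline whose columns line up with the two sequences; rows in this text where characters were misread or spacing was lost will not count correctly.

```python
def midline_counts(midline):
    """Return (identities, positives) encoded in one alignment midline.

    Letters mark identical residues; '+' marks conservative substitutions.
    Positives include the identities.
    """
    identities = sum(1 for ch in midline if ch.isalpha())
    positives = identities + midline.count("+")
    return identities, positives


def row_is_consistent(query, midline, sbjct):
    """True if every letter on the midline matches both sequences in that column."""
    if not (len(query) == len(midline) == len(sbjct)):
        return False
    return all(
        m == q == s
        for q, m, s in zip(query, midline, sbjct)
        if m.isalpha()
    )


if __name__ == "__main__":
    query, midline, sbjct = "TAILRMLF", "TAI RMLF", "TAIFRMLF"
    print(midline_counts(midline))                    # (7, 7)
    print(row_is_consistent(query, midline, sbjct))   # True
```

Summing the per-row counts over a whole alignment would give the numerators reported on its summary line, provided the rows are intact.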



Final Results 

bacterial membrane --- Certainty=0.5607 (Affirmative) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6951> which encodes the amino acid 
sequence <SEQ ID 6952>. Analysis of this protein sequence reveals the following: 

Possible site: 48 
>>> Seems to have no N-terminal signal sequence 
INTEGRAL Likelihood =-11.89 Transmembrane 361 - 377 ( 350 - 383) 
INTEGRAL Likelihood = -7.43 Transmembrane 84 - 100 ( 79 - 102) 
INTEGRAL Likelihood = -6.16 Transmembrane 150 - 166 ( 137 - 171) 
INTEGRAL Likelihood = -4.88 Transmembrane 30 - 46 ( 24 - 48) 
INTEGRAL Likelihood = -4.35 Transmembrane 299 - 315 ( 297 - 316) 
INTEGRAL Likelihood = -4.14 Transmembrane 117 - 133 ( 115 - 134) 
INTEGRAL Likelihood = -3.19 Transmembrane 54 - 70 ( 51 - 75) 
INTEGRAL Likelihood = -2.92 Transmembrane 425 - 441 ( 425 - 442) 
INTEGRAL Likelihood = -2.81 Transmembrane 213 - 229 ( 209 - 232) 
INTEGRAL Likelihood = -2.44 Transmembrane 273 - 289 ( 271 - 290) 

Final Results 

bacterial membrane --- Certainty=0.5755 (Affirmative) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

The protein has homology with the following sequences in the databases: 

>GP:AAB18291 GB:U35658 L-malate permease [Streptococcus bovis] 
Identities = 344/443 (77%) , Positives = 394/443 (88%) , Gaps = 6/443 (1%) 



Query: 4 ISKKMPQKDLSEHSKAWQNR RIGSVPLPVYLVLATLILVTGWLQQLPVNMLGGFAV 59 
 + KK+P +E W+N+ RIGSV LPVYLV A++ILVT L+QLPVNMLGGFAV 
Sbjct: 1 IffiKKLPATAANETD--WRNKLTKTRIGSVTLPVYLVTASIILVTALLEQLPVNMLGGFAV 58 

Query: 60 ILTLGWLLGTIGATIPGLKHFGGPAILSLLVPSILVFFNLLNPNVLEATNVLMKQANFLY 119 
 ILT+GWLLGTIG IP LKHFGGPAILSLLVPSI+VFFNLLN NVL++T++LMKQANFLY 
Sbjct: 59 ILTMGWLLGTIGGNIPILKHFGGPAILSLLVPSIMVFFNLLNQNVLDSTDILMKQANFLY 118 

Query: 120 FYIACLVCGSILGMNRKILIQGLFRMIIPMLLGMVCAM6VGTLVGVILGLDWQHTLFYW 179 
 FYIACLVCGSILGMNRKIL+QGL RMI+PM LGM+ AMGVGTLVG +LGL W+H+LFY+V 
Sbjct: 119 FYIACLVCGSILGMNRKILVQGLMRMIVPMALGMILAMGVGTLVGTLLGLGWKHSLFYIV 178 

Query: 180 TPVLAGGIGEGILPLSLGYSAITGVGSEQLVAQLIPATIIGNFFAILCTALLNRFGEKHP 239 
 TPVLAGGIGEGILPLSLGYSAITG+ SEQLV QLIPATIIGNFFAI+C+ LL+R GEK P 
Sbjct: 179 TPVLAGGIGEGILPLSLGYSAITGLPSEQLVGQLIPATIIGNFFAIMCSGLLSRLGEKRP 238 

Query: 240 SYSGQGQLVKIGHSEDMSDALKDNSGALDVKLMGAfiVLTACSLFIAGGLLQHLTDFPGPV 299 
 SGQGQL+KI +S+D+SnAL+++ +DVKLMGAGVL AC+LFI GGLLQHLT FP6PV 
Sbjct: 239 ELSGQGQLIKITNSDDLSDALEEDKAPIDVKLMGAfiVLIACTLFITGGLLQHLTGFPGPV 298 

Query: 300 LMIILAAFLKYLNVIPQETQNGAKQLYKFISSNFTFPLMAGLGLLYIPLKEWATLSWQY 359 
 LMI++AAFLKYLNV+P+ETQ G+KQLYKFIS NFTFPLM GLG+LYIPLK+W LSWQY 
Sbjct: 299 LMIWAAFLKYLNWPKETQRGSKQLYKFISGNFTFPLMV6LGMLYIPLKDWGMLSWQY 358 

Query: 360 FIWISWLTWSVGFFVSRFLNMSPVEAAIISACQSGMGGTGDVAILSTADRMNLMPFA 419 
 F+WISW TV++ GFFVSRP+NM+PVEAAI+SACQSGMGGTGDVAILSTA+RM LMPPA 
Sbjct: 359 FVWISWFTVIATGFFVSRFMNMNPVEAAIVSACQSGMGGTGDVAILSTANRMTLMPFA 418 

Query: 420 QVATRLGGAITVITMTAILRIIF 442 
 QVATRLGGAITVITMTAI R++F 
Sbjct: 419 QVATRLGGAITVITMTAIPRMLF 441 





An alignment of the GAS and GBS proteins is shown below. 

Identities = 356/419 (84%) , Positives = 385/419 (90%) 

Query: 27 KIGSVPLPVYVCIALLILLAGFLQKLPVNMLGGFAVILTMGWFLGTIGASIPGFKNFGGP 86 

+IGSVPLPVY+ LA LIL+ G+LQ+LPVNMLGGFAVILT+GW LGTIQA+IPG K+FGGP 
Sbjct: 24 RIGSVPLPVYLVLATLILVTGWLQQLPVNMLGGFAVILTLGWLLGTIGATIPGLKHFGGP 83 

Query: 87 AILSLLVPSILVFBWiINKlWLESTNMMKQaNFLYFYIACLVSGSILGMNRKMLIQGLL 146 

AILSLLVPSILVFFNL+N NVLE+TN+LMKQANFLYFYIACLV GSILGMNRK4LIQGL 
Sbjct: 84 AILSLLVPSILVFFNLLNPNVLEATNVLMKQANFLYFYIACLVCGSILGMNRKILIQGLF 143 

Query: 147 RMIFPMLLGWOUmVGTFVGVILGLEWRIITLFyiWPVLAGGIGEGILPLSLGYSSIT^ 206 

RMI PMLLGMVC3UM VGT VGVILGL+W+HTLFY+VTPVTiAGGIGEGILPLSLGYS+ITG 
Sbjct: 144 RMIIPMLLGMVCSUflGVGTLVGVILGLDWQHTLFYVVTPVIiAGGIGEGILPLSM^ 203 

Query: 207 VASEQLVAQLIPATIIGNFFAILCTALLNRLGEKKPHLSGQGQLVRIJSIKGEDMSDIIADH 266 
 V SEQLVAQLIPATIIGNFFAILCTALIiNR GEK P SGQGQLV++ EDMSD + D+ 

Sbjct: 204 VGSEQLVAQLIPATIIGNFFAILCTALUJRFGEKHPSYSGQGQLVKIGHSEDMSDALKDN 263 

Query: 267 SGPIDVKKMGGGVLTACSLFIFGHLLQQLTGFPGPVLMIVAAAILKYIIWIPRETQNGAK 326 
SG +DVK MG GVLTACSLFI G LLQ LT FPGPVI1MI+ AA LKY+NVIP+ETQNGAK 
Sbjct: 264 SGALDVKIMGAGVLTACSLFIAGGLLQHLTDFPGPVLMIirAAFLKYIOTIPQETQISK^ 323 

Query: 327 QLYKFISGNFTFPLMAGLGLLYIPLKDWATLSIQYFIWISWFTVISVGFFVSRFMM 386 

QLYKFIS NFTFPLMAGLGLLYIPLK+WATLS QYFIWISW TV+SVGFFVSRFLNM 
Sbjct: 324 QLYKFISSNFTFPLMAGLGLLYIPLKEWATLSWQYFIWISWLTWSVGFFVSRFLNM 383 


Query: 387 NPVEAGIISACQSGMGGTGDVAILSTADRMNMPFAQVATRICGAIWITMTAILR^ 445 

+PVEA IISACQSGMGGTGDVAILSTADRMNLMPFAQVATRLGGAITVITMTAII1R++F 
Sbjct: 384 SPVEAAIISACQSGMGGTGDVAILSTADRMNLMPFAQra.TRLGGAIWITOT 442 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2248 

A DNA sequence (GBSx2369) was identified in S.agalactiae <SEQ ID 6953> which encodes the amino 
acid sequence <SEQ ID 6954>. This protein is predicted to be malic enzyme (mae). Analysis of this protein 
sequence reveals the following: 

Possible site: 48 
>>> Seems to have no N-terminal signal sequence 
INTEGRAL Likelihood = -2.28 Transmembrane 164 - 180 ( 164 - 181) 

Final Results 

bacterial membrane --- Certainty=0.1914 (Affirmative) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAB07709 GB:U35659 malic enzyme [Streptococcus bovis] 
Identities = 285/386 (73%), Positives = 332/386 (85%), Gaps = 1/386 (0%) 

Query: 2 SENLGQLAINQARENGGKLEVISKVKVEDKRDLSIAYTPGVASVSSAIAEDVELAYELTT 61 
 ++++ +riAI QA++ GGKLEV KV +E K DL lAYTPGVA+VSSAI E E AYELTT 

Sbjct: 3 TKDVKELAIEQAKKPGGKLEVCPKVPIETKADLGIAYTPGVAAVSSAIYEKKERAYELTT 62 

Query: 62 KKNTVAVVSDGSAVLGLGDIGPEAAMPVNIEGKAALFKRFANVDAVPIVLK™^ 121 
KKNTVAV+SDGSAVLGLG+IGPEAAMPVMEGKAALFKRFA VD++P+VL T DTEEII 
Sbjct: 63 KKimaVISDGSAVLGLGNIGPEAAMPVMEGKftALFKRFAGVDSIPLVLDTQDT^ 122 

Query: 122 VKAISPTFGGINLEDISAPRCFEIEQRLIEECDIPVFHDDQHGTAIWLAALFNSLKLVK 181 

VK ++PTFGG1NLEDISAPRCFEIEQRLI+E DIPVFHDDQHGTAIWLAAL+NSLKL+ 
Sbjct: 123 VKFLAPTFGGINLEDISAPRCFEIEQRLIDELDIPVFHDDQHGTAIWLAALYNSLKLIN 182 


Query: 182 KDIEDIRVWNGGGSAGLSITRKLLSAGAKHVTWDRFGIINDKDRESLAPHHKAIAKLT 241 

K lEDI W+NGGGSAGLSITRK L+AG KH+ +VDR GI+++ D +L PHH lAKLT 
Sbjct: 183 KKIEDIHWINGGGSAGLSITRKFLAAGVKHIIIVDRTGILSETD-TALPPHHAEIAKLT 241 

Query: 242 NREFQSGSLEDALENADVFIGVSAPEALHAEWISKMADKPIVFAMANPIPEIYPDQALKA 301 

NRE ++G L ALE ADVF+GVSAP L EWI +M ++P++FAMaNP+PEI+PD+AL A 






Sbjct: 242 OTlEHRTGDLKTMBGimVFVGVSRPGVLKPEWIQ^TOEQPVIFMSiroVPEIFPDE^^ 301 

Query: 302 GAYIVGTGRSDFPNQINNVLAFPGIFRGALDARAKTITVEMQIAAARGIASLIPEEELST 361 

GAYIVGTGRSDFPNQINNVLAFPGIFRGALDARAK IT+EMQIAAA+GIA LIP+ EL+ 
Sbjct: 302 GAYIVGTGRSDFPNQINNVIAFPGIFRGALDiUlAKKITIEMQIAaAKGIAKLIPDN^^ 361 

Query: 362 THIIPNAFQNDVADWAKSVSNAVQK 387 

T+IIP+ FQ VA WA+SV NAV++ 
Sbjct: 362 TNIIPOPFQEGVAKWaESVENAVKE 387 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6955> which encodes the amino acid 
sequence <SEQ ID 6956>. Analysis of this protein sequence reveals the following: 

Possible site: 48 
>>> Seems to have no N-terminal signal sequence 

INTEGRAL Likelihood = -2.44 Transmembrane 164 - 180 ( 164 - 181) 
INTEGRAL Likelihood = -1.75 Transmembrane 94 - 110 ( 94 - 110) 



The protein has homology with the following sequences in the databases: 

>GP:AAB07709 GB:U35659 malic enzyme [Streptococcus bovis] 
Identities = 289/379 (76%) , Positives = 334/379 (87%) , Gaps = 1/379 (0%) 

Query: 7 QLALEQAKTFGGKXiEVQPKVDIKTKHDLSIAYTPGVaSVSSAIAKDKTLAYDLTTKKNTV 66 

+LA+EQAK FGGKLEV PKV I+TK DL lAYTPGVA+VSSAI + K AY+LTTKKNTV 
Sbjct: 8 ELAIEQAKKFGGKLEVCPKVPIETKADK3IAYTPGVAAVSSAIYEKKERAYELTTKKNTV 67 

Query: 67 AVISDGTAVLGLGDIGPEaaMPVMEGKAALFKAFAGVDAIPIVLDTKDTEEIISIVKALA 126 

AVISDG+AVLGLG+IGPEAAMPVMEGKAALFK FAGVD+IP+VLDT+DTEEII VK LA 
Sbjct: 68 AVISDGSAVLGLGNIGPEAAMPVMEGKAALFKRFAGVDSIPLVLDTQDTEEIIQTVKFLA 127 

Query: 127 PTFGGINLEDISAPRCFEIEQRLIKECHIPVPHDDQHGTAIWLAAIFNSLKLLKKSLDE 186 

PTFGGINLEDISAPRCFEIEQRLI E IPVFHDDQHGTAIWLAA++NSLKL+ K +++ 
Sbjct: 128 PTFGGINLEDISAPRCFEIEQRLIDELDIPVFHDDQHGTAIWLAALYNSLKLINKKIED 187 

Query: 187 VSIVTOGGGSAGLSITRKLLAAGATKTOATOKFGIINEQEAAQLAPHHIillAKVTNREFK 246 

+ +V+NGGGSAGLSITRK LAAG + +VD+ ei++E + A L PHH +IAK+TNRE + 
Sbjct: 188 IHWINGGGSftGLSITRKFLAAGVKHIIIVDRTGILSETDTA-LPPHHAEIAKLTNREHR 246 

Query: 247 SGTLEDALEGADIFIGVSAPGVLKAEWISKMAARPVIFAMANPIPEIYPDEALEAGAYIV 306 

+G h ALEGAD+F+GVSAPGVLK EWI +M +PVIFAMANP+PEH-PDEAL AGAYIV 
Sbjct: 247 TGDLATALEGRDVFVGVSAPGVLKPEWIQQMNEQPVIFAMANPVPEIFPDEALAaGAYIV 306 

Query: 307 GTGRSDFPNQINNVLAFPGIFRGALDARAKTITVEMQIAAAKGIASLVPDDALSTTNIIP 366 

GTGRSDFPNQINNVLAFPGIFRGALDARAK IT+EMQIAAARGIA L+PD+ L+ TNIIP 
Sbjct: 307 GTGRSDFENQIHNVljRFPGIFRGZiLDARAKKlTIEMQIARBBSIAKLIPDNELTPTNIIP 366 

Query: 367 DAFKEGVAEIVAKSVRSW 385 

D F+EGVA++VA+SVR+ V 
Sbjct: 367 DPFQEGVAKWAESVSNAV 385 

Final Results 

bacterial membrane --- Certainty=0.1977 (Affirmative) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

An alignment of the GAS and GBS proteins is shown below. 

Identities = 306/387 (79%) , Positives = 349/387 (90%) 

Query: 1 MSENLGQLAINQARENGGKLEVISKVKVEDKRDLSIAYTPGVASVSSAIAEDVELAYELT 60 
 M LGQLA+ QA+ GGKLEV KV ++ K DLSIAYTPGVASVSSAIA+D LAY+LT 
Sbjct: 1 MKNQLGQLALEQAKTFGGKIBVQPKVDIKTKHDLSIAYTPGVASVSSAIAKDKTLAYDLT 60 

Query: 61 TKKNTVAWSIX3SAVLGI/3DIGPEAAMPVMEGKAALFKRFANVDAVPIVLKI1^ 120 
 TKKNTVAV+SDG+AVLGLGDIGPEAAMPVMEGKAALFK FA VDA+PIVL T DTEEIIS 
Sbjct: 61 TKKIOTAVISDGTAVLGLGDIGPEAAMPVMEGKAALFKAFAGVDAIPIVLDTKDTEEIIS 120 

Query: 121 
 IVKA++PTFGGINIiEDISAPRCFEIEQRLI+EC IPVFHDDQHGTAIVVLAA+ETSISLKL+ 
Sbjct: 121 

Query: 181 
 KK ++++ +VVNGGGSaGLSITRKLL+AGA VTVVD+FGIIN+++ LAPHH IAK+ 
Sbjct: 181 

Query: 241 
 TNREF+SG+LEDALE AD+FIGVSAP L AEWISKMA +P+ + FAMANPIPEIYPD+AL+ 
Sbjct: 241 

Query: 301 
 AGAYIVGTGRSDFENQINimiAFPGIFRC3aiJ3ARAKTITVEMQiaaA+GIASL+P+ LS 
Sbjct: 301 

Query: 361 
 TT+IIP+AF+ VA++VAKSV + V K 
Sbjct: 361 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2249 

A DNA sequence (GBSx2370) was identified in S.agalactiae <SEQ ID 6957> which encodes the amino 
acid sequence <SEQ ID 6958>. This protein is predicted to be Bta. Analysis of this protein sequence 
reveals the following: 

Possible site: 19 
>>> Seems to have no N-terminal signal sequence 
INTEGRAL Likelihood = -2.02 Transmembrane 29 - 45 ( 29 - 45) 

Final Results 

bacterial membrane --- Certainty=0.1808 (Affirmative) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAD56628 GB:AF165218 Bta [Streptococcus pneumoniae] 
Identities = 35/112 (31%) , Positives = 63/112 (56%) 


Query: 1 MYSFEELLATMTLITAAEIEDKIDSNQDFVLFIGRISCPFCHLFVPKIVEVADEDEFELF 60 

MF + ++ + T +++D+ FIGR +CP+C F + V E + ++ 
Sbjct: 1 MEQFLDNIKDLEVTTWRAQERLDKKETATFFIGRRTCPYCRKFAGTLSGVVAETKAHIY 60 

Query: 61 HLDSEDPDHWTANKEFRNKXDIPTVPGIJWVKNGTIKWKCDSKMTKEEIREF 112 

++SE+ + FR++y IPTVPG + + +G I V+CDS M+ +EI++F 

Sbjct: 61 FINSEEASQLNDLQAFRSRYGIPTVPGFVHITDGQINVRCDSSMSAQEIKDF 112 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6959> which encodes the amino acid 
sequence <SEQ ID 6960>. Analysis of this protein sequence reveals the following: 

Possible site: 25 
>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.0900 (Affirmative) < succ> 
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 



An alignment of the GAS and GBS proteins is shown below. 

Identities = 39/111 (35%) , Positives = 66/111 (59%) 

Query: 3 SFEELLATMTLITAAEIEDKIDSNQDFVLFIGRISCPFCHLFVPKIVEVADEDEFELFHL 62 

+FEE++A + AE+ I S +D ++F+GR SCP+C F PK+ +VA +++ E++ + 

Sbjct: 11 TFEEIVANFIPSSVaEVTSAIASGKDMIVFLGRSSCPYCRRFAPKLAQVATDNQKEVyFV 70 

Query: 63 DSEDFDHWTANKEFRNKYDIPWPGLMVVKNGTIKVKCDSKMTKEEIREFI 113 

DSE+ FR Y + TVP L+V + + CDS +T ++I F+ 

Sbjct: 71 DSENaADAAELAAFRENYQLVTVPALLVSYDQHQRAVCDSSLTPDDIIAFL 121 

SEQ ID 6958 (GBS427) was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell 
extract is shown in Figure 80 (lane 5; MW 16.2kDa). 

GBS427-His was purified as shown in Figure 214, lane 8. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for 
vaccines or diagnostics. 

Example 2250 

A DNA sequence (GBSx2371) was identified in S.agalactiae <SEQ ID 6961> which encodes the amino 
acid sequence <SEQ ID 6962>. Analysis of this protein sequence reveals the following: 

Possible site: 26 
>>> Seems to have an uncleavable N-term signal seq 
INTEGRAL Likelihood = -7.75 Transmembrane 2 - 18 ( 1 - 21) 

Final Results 

bacterial membrane --- Certainty=0.4100 (Affirmative) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

A related GBS nucleic acid sequence <SEQ ID 9437> which encodes amino acid sequence <SEQ ID 9438> 
was also identified. 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:BAA11328 GB:D78257 ORFll [Enterococcus faecalis] 
Identities = 36/80 (45%) , Positives = 58/80 (72%) 

Query: 1 MSLPIIMLWMVGMMFFMQRQQKKQAQERQKQLNAVQKGDEIVTIGGLFGWDEVNTEAQ 60 

 ML +IML+V+V M F++ R QKKQ +ERQ LN +Q GD +VTIGGL GV+ E++++ + 

Sbjct: 1 MKLMLIMLLVIVAMYFYLFRTQKKQQKERQDFLNNLQPGDAWTIGGLHGVISEISSDKK 60 

Query: 61 RMVLDVDGVYLTFELAAIKS 80 
++ LD +G + F+ +I++ 
Sbjct: 61 KVTLDCEGAFFDFDQQSIRT 80 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6963> which encodes the amino acid 
sequence <SEQ ID 6964>. Analysis of this protein sequence reveals the following: 

Possible site: 60 
>>> Seems to have an uncleavable N-term signal seq 
INTEGRAL Likelihood = -6.10 Transmembrane 3 - 19 ( 1 - 22) 
INTEGRAL Likelihood = -3.03 Transmembrane 63 - 79 ( 63 - 79) 

Final Results 

bacterial membrane --- Certainty=0.3442 (Affirmative) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 



The protein has homology with the following sequences in the databases: 

>GP:BAA11328 GB:D78257 ORFll [Enterococcus faecalis] 
Identities = 29/75 (38%) , Positives = 52/75 (68%) 

Query: 6 ILMITAmLGLIWFMQRQQKKQAQERQNQriIlA.IEKGDEVVTIGGMFAIVDEVDTTAKKIVL 65 
 ++M +V++ + +++ R QKKQ +ERQ+ IiN ++ GD WTIGG+ ++ E+ + KK+ L 

Sbjct: 5 LIMLLVIVaMyFYLFRTQKKQQKERQDPIJSNLQPGmVVTIGG^ 64 

Query: 66 DVDGVFLTFELLRIK 80 
D +G F F+ +1+ 
Sbjct: 65 DCEGAFFDFDQQSIR 79 

An alignment of the GAS and GBS proteins is shown below. 

Identities = 63/90 (70%) , Positives = 80/90 (88%) 

Query: 4 PIIMLVVrWGmFFMQRQQKKQAQERQKQIjNA.VQKGDEIVTIGGLFGVVDEVNTEAQRMV 63 

PI+M WM+G+++FMQRQQKKQAQERQ QIiNA++KiGDE+VTIGG+F +VDEV+T A+++V 
Sbjct: 5 PILMFVVMIiGLIWFMQRQQKKQAQERQNQimiEKGDEVVTIGGMPAIVDEVDTTAKKIV 64 

Query: 64 LDVDGVYLTPELAAIKSWSKaATPTEPVE 93 

 LDVDGV+LTFEL AIK +V+KA T T VE 

Sbjct: 65 LDVDGVFLTFELLAIKRIVTKATTETTLVE 94 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2251 

A DNA sequence (GBSx2372) was identified in S.agalactiae <SEQ ID 6965> which encodes the amino 
acid sequence <SEQ ID 6966>. Analysis of this protein sequence reveals the following: 



Possible site: 21 
>>> Seems to have an uncleavable N-term signal seq 

Final Results 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

The protein has no significant homology with any sequences in the GENPEPT database. 
No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for 
vaccines or diagnostics. 

Example 2252 

A DNA sequence (GBSx2373) was identified in S.agalactiae <SEQ ID 6967> which encodes the amino 
acid sequence <SEQ ID 6968>. Analysis of this protein sequence reveals the following: 

Possible site: 16 
>>> Seems to have no N-terminal signal sequence 
INTEGRAL Likelihood = -1.38 Transmembrane 164 - 180 ( 164 - 180) 

Final Results 

bacterial membrane --- Certainty=0.1553 (Affirmative) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 



The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB61731 GB:AL133220 putative oxidoreductase [Streptomyces 
coelicolor A3(2)] 

Identities = 72/216 (33%) , Positives = 120/216 (55%) , Gaps = 1/216 (0%) 

Query: 14 AQALEARGQKLYSVaNRTYDKGLEFATKYGIQKVYDHIDQVFEDPEVDIIYISTPHNTHI 73 
 A ++ ++ +VA+RT FA ++GI + Y + + D +VD++Y++TPH+ H 

Sbjct: 25 ADLVDLPDAETVVAVRSRTEASAOFAERFGIPRAYGGWETLARDEDVDVVYVaTPHSffl 84 

Query: 74 SPLRKaLRNGKHVLCEKSITUTSTELKEAIDLaETIffiVVLAEJiMTIFHMPIY^^ 133 

+ L G++VLCEK TrjN+ E E + LA N V L EAM ++ P+ R+LK LV 

Sbjct: 85 TAAGLCLEAGRNVLCEKPFTIMAREAAELVALARENGVFLMEAMWMYCNPLVRRLKELVA 144 

Query: 134 SGKLGPLKMIQMNFGSYKEYDMTNRFFSRDLAGGALLDIGVYALSCIRWFMSEAPHNITS 193 

G +G ++ +Q +FG + +R GGALLD+GVY +S + + E P ++ + 

Sbjct: 145 DGRIGETOSLQZmFGIAGPFPAAHRLRDPAQGGGALIiDUSVYPVSFAQIiLIiGE-PTDVAA 203 


Query: 194 QVTFAPTGVDEQVGILLTNPANEMATVSLSLHAKQP 229 

+ + GVD Q G LL+ + +A++ S+ P 
Sbjct: 204 RAVLSEEGVDLQTGALLSYGNDALASIHCSITGGTP 239 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for 
vaccines or diagnostics. 

Example 2253 

A DNA sequence (GBSx2374) was identified in S.agalactiae <SEQ ID 6969> which encodes the amino 
acid sequence <SEQ ID 6970>. This protein is predicted to be surface protein Rib. Analysis of this protein 
sequence reveals the following: 

Possible site: 45 
>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4957 (Affirmative) < succ> 
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for 
vaccines or diagnostics. 

Example 2254 

A DNA sequence (GBSx2375) was identified in S.agalactiae <SEQ ID 6971> which encodes the amino 
acid sequence <SEQ ID 6972>. This protein is predicted to be surface protein Rib. Analysis of this protein 
sequence reveals the following: 

Possible site: 24 
>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.1892 (Affirmative) < succ> 
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for 
vaccines or diagnostics. 

Example 2255 

A DNA sequence (GBSx2376) was identified in S.agalactiae <SEQ ID 6973> which encodes the amino 
acid sequence <SEQ ID 6974>. This protein is predicted to be a host cell surface-exposed lipoprotein. 
Analysis of this protein sequence reveals the following: 

Possible site: 38 
>>> Seems to have an uncleavable N-term signal seq 
INTEGRAL Likelihood = -7.75 Transmembrane 9 - 25 ( 5 - 28) 

Final Results 

bacterial membrane --- Certainty=0.4100 (Affirmative) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

A related GBS nucleic acid sequence <SEQ ID 9005> which encodes amino acid sequence <SEQ ID 9006> 
was also identified. Analysis of this protein sequence reveals the following: 

Lipop: Possible site: -1 Crend: 3 
SRCFLG: 0 
McG: Length of UR: 24 
Peak Value of UR: 2.84 
Net Charge of CR: 2 
McG: Discrim Score: 10.29 
GvH: Signal Score (-7.5): -4.34 
Possible site: 34 
>>> Seems to have an uncleavable N-term signal seq 
Amino Acid Composition: calculated from 1 
ALOM program count: 1 value: -7.75 threshold: 0.0 
INTEGRAL Likelihood = -7.75 Transmembrane 5 - 21 ( 1 - 24) 
PERIPHERAL Likelihood = 13.31 86 
modified ALOM score: 2.05 
icml HYPID: 7 CFP: 0.410 

*** Reasoning Step: 3 

Final Results 

bacterial membrane --- Certainty=0.4100 (Affirmative) < succ> 
bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAC03455 GB:AF020798 putative host cell surface-exposed
lipoprotein [Streptococcus thermophilus bacteriophage TP-J34]
Identities = 40/102 (39%), Positives = 63/102 (61%), Gaps = 10/102 (9%)

Query: 101 KNALISAKIYSKTMNLSKQSIFEQLYSESPDKATHSDKFTKEESQYAIDHLKVDFKENAL 160
+ A+ AK Y+ T+++SK+ + QL S DK++++ S YA+++ +D+ + AL
Sbjct: 51 RTAVSKAKQYASTVHMSKEELRSQLVS FDKYSQDASDYAVENSGIDYNKQAL 102

Query: 161 ETAKSYQSSSSLSKEEIYKQLTSTLGDKFTNDEAQYAVDHLK 202
E AK YQ + S+S + I QL S DKFT +EA YAV +LK
Sbjct: 103 EKAKQYQDTLSMSPDAIRDQLVSF--DKFTQEEADYAVANLK 142

Identities = 40/112 (35%) , Positives = 64/112 (56%) , Gaps = 9/112 (8%) 

Query: 41 KKAKIKENKTQKKIVKKRREYAKSGHMSKDSIIEKLKCTSKKXRQEDINFVIl^ 100
+ ++ K K + V KA++YA + HMSK+ + +L K Y Q+ ++ + N +DY
Sbjct: 40 QSSESKVPKEYRTAVSKAKQYASTVHMSKEELRSQLVSFDK-YSQDASDYAVENSGIDYN 98

Query: 101 KNALISAKIYSKTMNLSKQSIFEQLYSESPDKATHSDKFTKEESQYAIDHLK 152
K AL AK Y T+++S +I +QL S DKFT+EE+ YA+ +LK
Sbjct: 99 KQALEKAKQYQDTLSMSPDAIRDQLVS FDKFTQEEADYAVANLK 142
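
Each alignment above is headed by a summary line giving identities, positives and gaps as a count over the aligned length, followed by a whole-number percentage. A minimal sketch for reading such a line and recomputing the percentages is given below. It rests on an assumption inferred from the printed figures rather than from any program documentation: the identity and gap percentages appear to be truncated, and the positives percentage appears to be the truncated identity percentage plus the separately truncated percentage of conservative substitutions. The names `STAT_RE`, `parse_alignment_stats` and `expected_percentages` are hypothetical.

```python
import re

# Matches fields such as "Identities = 40/102 (39%)"; extra spaces are tolerated.
STAT_RE = re.compile(
    r"(Identities|Positives|Gaps)\s*=\s*(\d+)\s*/\s*(\d+)\s*\(\s*(\d+)\s*%\s*\)"
)


def parse_alignment_stats(line):
    """Return {field: (count, aligned_length, printed_percent)}."""
    return {
        name: (int(count), int(length), int(percent))
        for name, count, length, percent in STAT_RE.findall(line)
    }


def expected_percentages(stats):
    """Recompute percentages under the truncation assumption stated above."""
    identities, length, _ = stats["Identities"]
    positives = stats["Positives"][0]
    identity_pct = 100 * identities // length
    # Assumption: conservative substitutions are truncated separately, then added.
    positive_pct = identity_pct + 100 * (positives - identities) // length
    expected = {"Identities": identity_pct, "Positives": positive_pct}
    if "Gaps" in stats:
        expected["Gaps"] = 100 * stats["Gaps"][0] // length
    return expected


line = "Identities = 40/102 (39%), Positives = 63/102 (61%), Gaps = 10/102 (9%)"
stats = parse_alignment_stats(line)
recomputed = expected_percentages(stats)
```

Comparing `recomputed` with the printed percentages gives a quick consistency check on a transcribed summary line; a mismatch suggests a misread digit rather than a different alignment.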



No corresponding DNA sequence was identified in S.pyogenes. 
SEQ ID 9006 (GBS122) was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 38 (lane 6; MW 21.9kDa). 

GBS122-His was purified as shown in Figure 202, lane 8. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2256 

A DNA sequence (GBSx2377) was identified in S.agalactiae <SEQ ID 6975> which encodes the amino 
acid sequence <SEQ ID 6976>. This protein is predicted to be transposase (orfA). Analysis of this protein 
sequence reveals the following: 

Possible site: 42

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.2830 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB90833 GB:AJ250837 hypothetical protein [Streptococcus dysgalactiae] 
Identities = 91/96 (94%), Positives = 93/96 (96%)

Query: 1 MSRK\niRHFTDDFKQQlVDLYNVGRKRSSLIKVyELTPSTFDroi^ 60
MSRK+RRHFTDDFKQQIVDLYN GRKRSSLIK YELTPSTFDKWVRQAKTTGSFKS+DNL
Sbjct: 1 MSRKIRRHFTDDFKQQIVDLYNRGRKRSSLIKEYELTPSTFDKWVRQAKTTGSFKSVDNL 60

Query: 61 TDEQRELIELRKHNKELEMQLDILKQAAVIMAQKGK 96
TDEQRELIELRK NKELEMQLDILKQAAVIMAQKGK
Sbjct: 61 TDEQRELIELRKRNKELEMQLDILKQAAVIMAQKGK 96

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.
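
Every example opens with the same sentence, which ties a GBSx locus name to a DNA sequence identifier and to the identifier of the encoded amino acid sequence. As a minimal sketch, assuming that opening sentence keeps its wording, the three values could be pulled out as shown below. `RECORD_RE` and `parse_record` are hypothetical names used only for this illustration.

```python
import re

# Assumed wording of the opening sentence of each example; line breaks inside
# the sentence are tolerated by matching any whitespace between words.
RECORD_RE = re.compile(
    r"A DNA sequence \((?P<locus>GBSx\d+)\) was identified in (?P<organism>\S+)\s+"
    r"<SEQ ID (?P<dna>\d+)>\s+which encodes the amino\s+acid\s+sequence\s+"
    r"<SEQ ID (?P<protein>\d+)>"
)


def parse_record(text):
    """Return locus, organism and SEQ ID numbers, or None if the sentence is absent."""
    m = RECORD_RE.search(text)
    if m is None:
        return None
    return {
        "locus": m["locus"],
        "organism": m["organism"],
        "dna_seq_id": int(m["dna"]),
        "protein_seq_id": int(m["protein"]),
    }


sample = (
    "A DNA sequence (GBSx2377) was identified in S.agalactiae <SEQ ID 6975> "
    "which encodes the amino\nacid sequence <SEQ ID 6976>."
)
record = parse_record(sample)
```

In the examples shown here the protein identifier is the DNA identifier plus one, which makes `record["protein_seq_id"] - record["dna_seq_id"]` a simple sanity check on a transcribed pair.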

Example 2257 

A DNA sequence (GBSx2378) was identified in S.agalactiae <SEQ ID 6977> which encodes the amino 
acid sequence <SEQ ID 6978>. This protein is predicted to be transposase (orfB). Analysis of this protein 
sequence reveals the following:

Possible site: 16

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.2618 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9915> which encodes amino acid sequence <SEQ ID 9916> 
was also identified.

A related GBS nucleic acid sequence <SEQ ID 9903> which encodes amino acid sequence <SEQ ID 9904> 
was also identified. 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB90834 GB:AJ250837 putative transposase [Streptococcus dysgalactiae]
Identities = 243/259 (93%), Positives = 250/259 (95%)

Query: 1
MCRWIiN+P SSYYY+AVE VSE E EE+IK IFL+S++RYGSRKIKICI1NNEGITLSRRR
Sbjct: 1

Query: 61
IRRIMKRLNLVSVYQKATFKPHSRGKNEAPIPNHLDRQFK ERPLQALVTDLTYVRVGNR
Sbjct: 61

Query: 121
WAYVCLIIDLTOREIIGLSLGWHKTAELVRQRIQSIPY LTKVKMFHSDRGKEF+NQLID
Sbjct: 121

Query: 181
EILEAFGITRSLSQAGCPyDNAVAESTYRAFKIEFVYQETFQ LEELaLKTK YVHWWNY
Sbjct: 181

Query: 241
HRIHGSUSra-QTPMTKRLIA
Sbjct: 241



There is also homology to SEQ ID 32. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2258 

A DNA sequence (GBSx2379) was identified in S.agalactiae <SEQ ID 6979> which encodes the amino 
acid sequence <SEQ ID 6980>. This protein is predicted to be pXO1-128. Analysis of this protein sequence
reveals the following:

Possible site: 20 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.3684 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:7Uy332432 GB:AF065404 pXO1-128 [Bacillus anthracis]
Identities = 45/69 (65%), Positives = 52/69 (75%)

Query: 17 MKKAGKSKRVIMETLGIKiniSQITimKWYElffiELYRFHQGVGK^ 76
MKK SNR IME LGIKN SQI TWMKWY ++ YRF Q VGRQY+YGKG + LSE+EQ
Sbjct: 1 MKKESYSITOTIMEKL6IKNVSQIKTWMKWYRTDQTYRFQQPVGKQYSYGRGPKELSELEQ 60

Query: 77 LQLQVDLLK 85
L+L+ LK
Sbjct: 61 LRLENKHLK 69
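
In the alignments above, the line between each Query and Sbjct row marks identical residues with the residue letter and conservative substitutions with "+". A minimal sketch of how the identity and positive counts for one block could be tallied from that middle line follows. `count_midline` is a hypothetical helper, and the sketch assumes the middle line has been transcribed without stray letters.

```python
def count_midline(midline):
    """Return (identities, positives) for one alignment block.

    Letters mark identical residues and '+' marks conservative substitutions;
    positives are identities plus conservative substitutions.
    """
    identities = sum(1 for ch in midline if ch.isalpha())
    conservative = midline.count("+")
    return identities, identities + conservative


# Middle line of the final block of the alignment above (Query 77-85).
last_block = count_midline("L+L+ LK")
```

Summing the per-block results over all blocks of an alignment should reproduce the numerators printed in its "Identities" and "Positives" summary, which offers a check on the transcription of the middle lines.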



No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 
Example 2259

A DNA sequence (GBSx2380) was identified in S.agalactiae <SEQ ID 6981> which encodes the amino 
acid sequence <SEQ ID 6982>. This protein is predicted to be transposase. Analysis of this protein 
sequence reveals the following: 

Possible site: 25

>>> Seems to have an uncleavable N-term signal seq

Final Results

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2260 

A DNA sequence (GBSx2382) was identified in S.agalactiae <SEQ ID 6985> which encodes the amino
acid sequence <SEQ ID 6986>. This protein is predicted to be Lmb. Analysis of this protein sequence
reveals the following:

Possible site: 18 

>>> May be a lipoprotein

Final Results

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

A related DNA sequence was identified in S.pyogenes <SEQ ID 1595> which encodes the amino acid 
sequence <SEQ ID 1596>. Analysis of this protein sequence reveals the following:

Possible site: 18 

>>> May be a lipoprotein

Final Results

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below. 

Identities = 302/306 (98%), Positives = 303/306 (98%)

Query: 1 MKKOTFIiMftMWSLVMIAGCDKSOTPKQPTQGMSVVTSFYPMYflmTCEVSGDI^ 60
MKK FFLMAMWSLVMIAGCDKBBNPKQPTCXSMSVVTSPYPMYAMTKEVSGDIiNDVRMIQ
Sbjct: 1 MKKGFFLMMTVSLVMIMCDKSRNPKQPTQGMSVTOSPyPMYAMTKEVSGDIiNDVRMIQ 60

Query: 61 SGAGIHSFEPSVNDVAAIYDADLFVYHSHTLEAWARDLDPNLKKSKVNVFERSKPLTLDR 120
SGAGIHSFEPSVNDVi«\IYDflDLFVraSHTLEAWARDLDPNLKKSKV+VFEASKPLTIiDR
Sbjct: 61 SGZ«31HSFEPSVNDTffiMYDftDLFVYHSHTI^AWARDI£)PNLKKSKVI^^ 120

Query: 121 VKGLEDMEVTQGIDPATLYDPHTVmJPTLfifiEERVNIJ^ 180
VKGLEDMEVTQGIDPATLVDPHTWTDPVLAGEEAVNIAKELG LDPKHKDSYTK AKAFK
Sbjct: 121 VKGLEIMEVTQGIDPATLTOPHTWTDPVIAGEEAVNIAKELGRLDPKHKDSYrKNRKAF 180

WO 02/34771 -2548- PCT/GB01/04789

Query: 181 KEAEQLTEEYTQKFKJCJRSKTFTOQHTAFSYLRKRFGLKQLGISGISPEQEPSPRQLKEX 240
KEAEQLTEEYTQKFKKTOSKTFVTQHTAFSYLAKRFGLKQLGISGISPEQEPSPRQLKEI
Sbjct: 181 KEAEQLTEEYTQKFKKVRSKTFVTQHTAFSYLAKRFGLKQLGISGISPEQEPSPRQLKEI 240

Query: 241 QDEVKEyNVKTIFAEDNVNPKIAHAIAKSTGAKVKTLSPLEAAPSGNKTYLENLRANLEV 300
QDWKEYNVKTIFAEDimiPKIAHaiAKSTGAKVKTLSPLEaAPSGISIKTyM^
Sbjct: 241 QDFVKEyNVKTIFMDNVNPKIJmiAKSTGMVKTLSPLE)iU^PSGOTermEin^^ 300

Query: 301 LYQQLK 306
LYQQLK
Sbjct: 301 LYQQLK 306



There is also homology to SEQ ID 4. 

SEQ ID 6986 (GBS189) was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 38 (lane 2; MW 35.2kDa). 

The GBS189-His fusion product was purified (Figure 204, lane 7) and used to immunise mice. The
resulting antiserum was used for Western blot (Figure 248A), FACS (Figure 248B), and in the in vivo
passive protection assay (Table III). These tests confirm that the protein is immunoaccessible on GBS 
bacteria and that it is an effective protective immunogen. 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics.

Example 2261 

A DNA sequence (GBSx2383) was identified in S.agalactiae <SEQ ID 6987> which encodes the amino 
acid sequence <SEQ ID 6988>. Analysis of this protein sequence reveals the following: 

Possible site: 46

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.4656 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>



The protein has homology with the following sequences in the GENPEPT database. 



>GP:AAB41455 GB:U34956 phosphoribosylformylglycinamidine synthase
[Mycobacterium tuberculosis]
Identities = 73/237 (30%), Positives = 112/237 (46%), Gaps = 25/237 (10%)



Query: 43 GAGGVCVAIGELRD GLEIDLDKVPLKYQGIiNGTEIAISESQERMSVWGPSDVDAF 98
G G+ A ELA G+ I LD VPL+ + + E+ SESQERM W P +VnAF
Sbjct: 282 GGAGLSCATSELASAGDGGMTIQLDSVPLRAKEMTPAEVLCSESQERMCAWSPKNVDAF 341

Query: 99 lAACNKENIDAVWATVTEKPNLVMTWNGETIVDLERCFLDTNG VRWVDAKW 152
+A C K + A V+ VT+ L +TW+GET+VD+ + G V +
Sbjct: 342 LAVCRKMEVIATOIGEVTIX3DRLQITWHGETVVr>VPPRTVAHE6PVYQRPVaRPDTQDAL 401

Query: 153 DKDLTVPEARTTSAETLEM)MLKyLSDIJSlHASQK6LQTIFDSSVGRSTV--NHPIGGRYQ 210
+ D + +R + + L A +L +L + S+ + +D V +TV H GG +
Sbjct: 402 NADRSAKLSRPVTGDELRATLLALLGSPHLCSRAFITEQYDRYVRGNTVLAEHADGGMLR 461

Query: 211 ITPTESSVQKLPVQYGVTTTASVMAQGYNPYIAEWSPYHGAAYAVIEATARLVATGA 267
I ES+ + + V + +++ PY GA A+ EA + TGA
Sbjct: 462 I--DESTGRGIAVSTDASGRYTLL DPYAGaQLALAEAYRNVAVTGA 505



There is also homology to SEQ ID 982. 
Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2262 

A DNA sequence (GBSx2384) was identified in S.agalactiae <SEQ ID 6989> which encodes the amino 
acid sequence <SEQ ID 6990>. This protein is predicted to be 30S ribosomal protein S11 (rpsK). Analysis
of this protein sequence reveals the following:

Possible site: 37

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.0598 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9281> which encodes amino acid sequence <SEQ ID 9282>
was also identified. A further related GBS nucleic acid sequence <SEQ ID 10919> which encodes amino
acid sequence <SEQ ID 10920> was also identified.

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB11918 GB:Z99104 ribosomal protein S11 (BS11) [Bacillus subtilis]
Identities = 81/92 (88%), Positives = 87/92 (94%)

Query: 2 HGNAIAWSSAGMjGFKGSRKSTPFAAQMAaEaWUCSAQEHGLKTVEVTVKGPGSGRESAI 61
HGNA++WSSAGALGF+GSRKSTPFAAQMafiE AAK + EHGLKT+EVTVKGPGSGRE+AI
Sbjct: 40 HGNAISWSSAGALGFRGSRKSTPFAAQMAftETAAKGSIEHGLKTLEVTVKGPGSGREAAI 99

Query: 62 RALAAAGLEVTAIRDVTPVPHNGftRPPKRRRV 93
RAL AAGLEVTAIRDVTPVPHNG RPPKRRRV
Sbjct: 100 RALCJAAGLEVTAIRDVTPVPHNGCRPPKRRRV 131

A related DNA sequence was identified in S.pyogenes <SEQ ID 6991> which encodes the amino acid
sequence <SEQ ID 6992>. Analysis of this protein sequence reveals the following: 

Possible site: 47 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.0945 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below.

Identities = 92/93 (98%), Positives = 93/93 (99%)

Query: 1 MHGNALAWSSAGALGFKGSRKSTPFAAQMAAEAAAKSAQEHGLKTVEVTVKGPGSGRESA 60
+HGNALAWSSAGALGFKGSRKSTPFAAQMAAEAAAKSAQEHGLKTVEVTVKGPGSGRESA
Sbjct: 35 VHGNALAWSSAGALGFKGSRKSTPFAAQMAAEAAAKSAQEHGLKTVEVTVKGPGSGRESA 94

Query: 61 IRALAAAGLEVTAIRDVTPVPHNGARPPKRRRV 93
IRALAAAGLEVTAIRDVTPVPHNGARPPKRRRV
Sbjct: 95 IRALAAAGLEVTAIRDVTPVPHNGARPPKRRRV 127

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 
Example 2263

A DNA sequence (GBSx2385) was identified in S.agalactiae <SEQ ID 6993> which encodes the amino 
acid sequence <SEQ ID 6994>. Analysis of this protein sequence reveals the following: 

Possible site: 53 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.2551 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:BAB03881 GB:AP001507 DNA-directed RNA polymerase alpha subunit
[Bacillus halodurans]
Identities = 190/314 (60%), Positives = 249/314 (78%), Gaps = 2/314 (0%)

Query: 1 MIEFEKPIITKIDENKD--YGRFVIEPLERGYGTrLGNSLRRVLLSSLPGAAVTSIKIDG 58
MIE EKP+I 1+ ++D YG+FV+EPLERGYGTTLGNSLRR+LLSSLPGAAVTS++IDG
Sbjct: 1 MIEIEKPVIETIEISEDAKYGKFVVEPLERGyGTlTiGNSLRRIIiLSSLPGftAVTSVQIDG 60

Query: 59 VLHEFDTIPGVREDVMQIIIJJVKBLAVKSYVEDEKIIELDVEGPAEITAGDILTDSDIEI 118
VLHEF TI GV EDV I+USI+K LA+K Y +++K +E+D +G +TAGD+ DSD+++
Sbjct: 61 VIlHEFSTIEGVVEDVTTIVL^^^KQIJALKIYSDEDKTLEIDTQGEGVVTAfiDLTHDSDV^ 120

Query: 119 VNPDHYLFTIi^GHSLKATMTVAKimGYVPAEGNKKDDaPVGTlAVDSIYTPVKKVNyQV 178
+NPD ++ T+ G L+ +T + RGYVPAEGNK D+ +G + +DSIYTPV +VNyQV
Sbjct: 121 lOTDLHIATLTTGAHLRMRITAKRGRGYVPAEGNKSDEIAIGVIPIDSIYTPVSRVNYQV 180

Query: 179 EPARVGSNDGPDKLTIEI^mJGTIIPEDftLGLSARVLIEHIilS^^FTDLTEVAKAT^ 238
E RVG +DiajT+++ T+G+I PE+A+ L A++L EHLN+F LT+ A+ E+M E
Sbjct: 181 ENTRVGQVTNYDKLTLDVWTDGSIRPEEAVSLGAKILTEHLNIFVGLTDQAQNAEI^^VEK 240

Query: 239 EKOTTOEIWLDRTIEELDLSVRSYNCLKRAGINTVFDLTEia'EPEMMKVRNLGRKSLEEVK 298
E+ EKVL+ TIEELDLSVRSYNCLKRAGINTV +LT+KTE +MMKVRNLGRKSLEEV+
Sbjct: 241 EEDQKEKVLEMTIEELDLSWSYNCLKRAGIimQELTQKTEEDMMKOTaSLGRKSIjEEVQ 300

Query: 299 IKLRDLGLGLKNDK 312
KL +I1GLGL+ ++
Sbjct: 301 EKL6EIiQLG]:>RKEE 314





A related DNA sequence was identified in S.pyogenes <SEQ ID 6995> which encodes the amino acid
sequence <SEQ ID 6996>. Analysis of this protein sequence reveals the following: 

Possible site: 53 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.2551 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below. 

Identities = 305/312 (97%), Positives = 311/312 (98%)

Query: 1 MIEFEKPIITKIDENKDYGRFVIEPLERGYGTTLGNSLRRVLLSSLPGAAVTSIKIDGVL 60
MIEFEKPIITKIDENKDYGRFVIEPLERGYGTTLGNSLRRVLLSSLPGAAVTSIKIDGVL
Sbjct: 1 MIEFEKPIITKIDENKDYGRFVIEPLERGYGTTLGNSLRRVLLSSLPGAAVTSIKIDGVL 60

Query: 61 HEFDTIPGVREDVMQIILNVKGLAVKSYVEDEKIIELDVEGPAEITAGDILTDSDIEIVN 120
HEFDTIPGVREDVMQIILNVKGLAVKSYVEDEKIIEL+VEGPAE+TAGDILTDSDIE+VN
Sbjct: 61 HEFDTIPGVREDVMQIILNVKGLAVKSYVEDEKIIELEVEGPAEVTAGDILTDSDIELVN 120

Query: 121 PDHYLFTIAEGHSIiKaTMTVIiKimGyVPiffiGinCKDDAPVGTIA^ 180
PDHYLFTIAEGHSL+ATMTVAK RGYVPAEGNKKDDAPVGTLAVDSIYTPVKKVNYQVEP
Sbjct: 121 PDHYLFTIAEGHSLRATMTVAKKRGYVPAEGNKKDDAPVGTLAVDSIYTPVKKVNYQVEP 180

Query: 181 ARVGSlSroGFDKI.TIEIMTNGTIIPEDALGLSARVLIEHIJn:jFTDLTEVAKaTEVMKETEK 240
ARVGSNDGFDKLTIEIMTNGTIIPEDALGLSARVLIEHMiiFTDLTEVaKATEVMKETEK
Sbjct: 181 ARVGSmXSFDKLTIEIMTOGTIIPEDALGLSARVLIEHIJinjFTDLTEVaKATEVMKETEK 240

Query: 241 VITOEKOT^DRTIEELDLSTOSYNCLKRAGIimrFDLTEKTEPEMMKVRNLGRKSLEEVKIK 300
VNDEKVIiDRTIEEIJDLSWSYNCLKRAGINTVFDLTEK+EPEMMKArailCR^^
Sbjct: 241 VlTOEKVLDRTIEELDLSTOSYHCLKRRfiIimrFDLTEKSEPEmKVimLGRKSLEE\^^ 300

Query: 301 LADLGLGLKNDK 312
lADLGLGIiKNDK
Sbjct: 301 LADLGLQLKIODK 312





Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2264 

A DNA sequence (GBSx2386) was identified in S.agalactiae <SEQ ID 6997> which encodes the amino 
acid sequence <SEQ ID 6998>. This protein is predicted to be 50S ribosomal protein L17 (rplQ). Analysis
of this protein sequence reveals the following:

Possible site: 37 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1609 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB11920 GB:Z99104 ribosomal protein L17 (BL15) [Bacillus subtilis]
Identities = 95/128 (74%), Positives = 105/128 (81%), Gaps = 8/128 (6%)

Query: 1 MAYRKLGRTSSQRKAMLRDLTTDLLINESIVTTEARAKEIRKTVEKMITLGKRGDLHARR 60 

M+YRKLGRTS+QRKAMLRDLTTDL+INE I TTE RAKE+R VEKMITLGKRGDLHARR 
Sbjct: 1 MSYRKLGRTSAQRKAMLRDLTTDLIINERIETTETRAKELRSWEKMITLGKRGDLHARR 60 

Query: 61 QAAAYVRNEIASENYDEASDKYTSTTALQKLFDDIAPRYAERNGGYTRILKTEPRRGDAA 120 

QAftAY+RNE+A+E ++ ALQKLF DIA RY ER GGYTRI+K PRRGD A 

Sbjct: 61 CJAAAYIRNEVftNEENNQ-- DALQKLFSDIATRYEERQGGYTRIMKLGPRRGDGA 112 

Query: 121 PMAIIELV 128 

PMAIIELV 
Sbjct: 113 PMAIIELV 120 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6999> which encodes the amino acid 
sequence <SEQ ID 7000>. Analysis of this protein sequence reveals the following: 

Possible site: 37 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.1609 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below. 

Identities = 125/128 (97%), Positives = 127/128 (98%)

Query: 1 MAYRKLGRTSSQRKAMLRDLTTDLLINESIVTTEARAKEIRKTVEKMITLGKRGDLHARR 60
MAYRKLGRTSSQRKAMLRDLTTDLLINESIVTTEARAKEIRKTVEKMITLGKRGDLHARR
Sbjct: 1 MAYRKLGRTSSQRKAMLRDLTTDLLINESIVTTEARAKEIRKTVEKMITLGKRGDLHARR 60

Query: 61 QAAAYVRNEIASENYDEASDKYTSTTALQKLFDDIAPRYAERNGGYTRILKTEPRRGDAA 120
QAAAYVRNEIASENYDEA+DKYTSTTALQKLF +IAPRYAERNGGYTRILKTEPRRGDAA
Sbjct: 61 QAAAYVRNEIASENYDEATDKYTSTTALQKLFSEIAPRYAERNGGYTRILKTEPRRGDAA 120

Query: 121 PMAIIELV 128
PMAIIELV
Sbjct: 121 PMAIIELV 128

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics.

Example 2265 

A DNA sequence (GBSx2396) was identified in S.agalactiae <SEQ ID 7001> which encodes the amino 
acid sequence <SEQ ID 7002>. This protein is predicted to be mercuric reductase. Analysis of this protein 
sequence reveals the following: 

Possible site: 35

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.2384 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAA83977 GB:AF138877 mercuric reductase MerA [Bacillus sp. RC607]
Identities = 29/33 (87%), Positives = 32/33 (96%)

Query: 4 VGLTEEQAKEKGYDVKTSVLPLXAVPRAIVNRE 36
VGLTE+QAKEKGY+VKTSVLPL AVPRA+VNRE
Sbjct: 520 VGLTEQQAKEKGYEVKTSVLPLDAVPRALVNRE 552

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2266

A DNA sequence (GBSx2397) was identified in S.agalactiae <SEQ ID 7003> which encodes the amino 
acid sequence <SEQ ID 7004>. This protein is predicted to be mercuric reductase. Analysis of this protein 
sequence reveals the following:

Possible site: 49

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.3016 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>



The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAA70224 GB:Y09024 mercuric reductase [Bacillus cereus]
Identities = 146/194 (75%), Positives = 175/194 (89%)

Query: 2 PQISGLEKMDYLTSTTLLELKKIPKRLTVIGSGYIGMELGQLFHHLGSEITLMQRSERIiL 61
P I GL ++DYLTST+LLEI1KK+PKRL VIGSGYIGMELGQLFH+LGSE+TL+QRSERLL
Sbjct: 226 PNIPGIiNEVDYLTSTSLimKKVPKRLWIGSGYIGMELGQLFHlSMSEV^^ 285

Query: 62 KEYDPEISESVEKRLIEQGINLVKGATFERVEQSGEIKRVWrViroSREVIESDQIiLVM 121
KEYDPEISESVEK+L+EOGINLVKBAT+ER+EO+G+IK+V+V VNG + +IE+DOLLVAT
Sbjct: 286 KEYDPEISESTOKSLVEQGIinjVKGATYERIEQNGDIKKVHVEVNGKKRIIEflDQLLVAT 345

Query: 122 GRKPNTDSLNLSAAGVETGKNNEILINDFGQTSNEKIYAAGDVTLGPQFVYVJiAYEGGII 181
GR PNT +LNL AAGVE G EI+I+D+ +T+N +lYftaGDVTLGPQFVYVAaY+GG+
Sbjct: 346 GRTPNTATIJILRaAGVEIGSRGEIIIDDYSRTmrRIYaAGDVTLGPQPVyvaAYQGGVA 405

Query: 182 TDNAIGGENKKIDIi 195
NAIGGLNKK++L
Sbjct: 406 APNAIGGIjNKKLNL 419





There is also homology to SEQ ID 1820. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2267 

A DNA sequence (GBSx2398) was identified in S.agalactiae <SEQ ID 7005> which encodes the amino 
acid sequence <SEQ ID 7006>. This protein is predicted to be triacylglycerol acylhydrolase. Analysis of this 
protein sequence reveals the following:

Possible site: 46 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.3180 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2268 

A DNA sequence (GBSx2399) was identified in S.agalactiae <SEQ ID 7007> which encodes the amino 
acid sequence <SEQ ID 7008>. Analysis of this protein sequence reveals the following:

Possible site: 42 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.0544 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAC74453 GB:AE000234 orf, hypothetical protein [Escherichia coli K12]
Identities = 45/58 (77%), Positives = 51/58 (87%)

Query: 1 MPWQNLLHAGQENLFSGLTALTMFTVGEGKLMTHDEPCSMAPDDKHDLISGTCSHLP 58
+PWQNLLHAG+ENLFSGLTAL+AEFT+GEG+LM HD P APD+ DLISGTCSHLP
Sbjct: 34 LPWQNLLHAGEENLFSGLTALSAEFTIGEGELMaHDVPLGCAPDEmDLISGTCSHLP 91

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2269 

A DNA sequence (GBSx2400) was identified in S.agalactiae <SEQ ID 7009> which encodes the amino 
acid sequence <SEQ ID 7010>. This protein is predicted to be transposase for insertion sequence element
IS5. Analysis of this protein sequence reveals the following:

Possible site: 48 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2058 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:BAB15497 GB:AK026530 unnamed protein product [Homo sapiens]
Identities = 297/299 (99%), Positives = 297/299 (99%)

Query: 1 MEQILPWQI^WEVIEPFYPKAGNGRRPYPLETMLRIHCMQHVirYNLSDGAMEDALYEIASM 60
MEQILPWQNMVEVIEPFYPKA(aiGRRPYPLETMI«IHCMQHWYNLSDGAMEDALYEIASM
Sbjct: 40 MEQlLPWQNMVEVIEPFYPKftGNGRRPYPLETMLRlHCMQHWYNLSDGRMEna^ 99

Query: 61 RLFARLSLDSALPDRTTIMNFRHLLEQHQLARQLFKTINRWLAEAGVMMTQGTL 120
RLFARLSrjJSALPDRTTIMNFRHLLEQHQI^QLFKTINRWLftEaiGVK^^
Sbjct: 100 RLFARLSLDSaLPDRTTIOTFRHia^EQHQMJlQLFKTINRWLREAGVMM^TLVDATII 159

Query: 121 EAPSSTKNKEQQRDPEMHQTKKGNQWHFGMKAHIGVDAKSGLTHSLVTTAANEHDLNQLX 180
EAPSSTKNKEQQRDPEMHQTKKGNQWHFGMKAHIGVDAKSGLTHSLVTTAANEHDIiNQL
Sbjct: 160 EAPSSTKHKEQQRDPEMHQTKKGNQWHFGMKAHIGVDAKSGLTHSLVTTAANEHDMQLG 219

Query: 181 NLLHGEEQFVSflDfiXYQGAPQREEIAEVDVDWLIiffiRPGKWTLKQHPRKNKTAINIEYM 240
HLLHGEEQFVSZyaA YQGAPQREELREVDVDVILIAERPGKVRTLKQHPRKNKrAINIEYM
Sbjct: 220 NLLHGEEQFVSanftGYQGaPQREEIAEVDVDWLIAERPGKVRTLKQHPRKNKTAINIEYM 279

Query: 241 KASIRARVEHPFRIIKRQFGFVKARYKGLLKNDKQLAMLFTLANLFRADQMIRQWERSH 299
KASIRARVEHPFRIIKRQFGFVKiUlYKGLLKiroNQIiflMLFTLANLFRADQMIRQWERSH
Sbjct: 280 KASIRARVEHPFRIIKRQFGFVKftRYRGLLKNDNQIAMLFTLftNLFRADQMIRQWE^ 338

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2270 

A DNA sequence (GBSx2401) was identified in S.agalactiae <SEQ ID 7011> which encodes the amino
acid sequence <SEQ ID 7012>. Analysis of this protein sequence reveals the following: 

Possible site: 16

>>> Seems to have a cleavable N-term signal seq.

Final Results

bacterial outside Certainty=0.3000 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB51958 GB:AL109661 putative eukaryotic-type serine/threonine
protein kinase [Streptomyces coelicolor A3 (2)]
Identities = 49/169 (28%), Positives = 90/169 (52%), Gaps = 6/169 (3%)

Query: 23 PTTIRTODVSNKTVaQZiKMTLENSGLKVQAIRNIESDSVSEGLVVKlI)PAftGRSI^^ 82
P T+++PDV+ + +A+ LE+ GL+ G + SD V+ G V+ T P +G + R G+
Sbjct: 469 PDTVKI.PDVTGYKLDKARTLIJEDEGLEPG^OTRAFSDEVaRGFVISTKPGSGTTVRAGSA 528

Query: 83 VNLYIATPNKSFTLGNYKEHNYKDILKDLQGKGVKKSLIKVKRKINNDYTTGTILAQSLP 142
VL++ ++++ +L+G G+K + ++N++Y +G + A+ P
Sbjct: 529 VAL-WSKGSPVDVPDVTGDDLDEARAELEGAGLK--VKTADERVNSEYDSGRV-ARQTP 584

Query: 143 EGTSENPIX3NKKLTLTOAV]roPMI-MPDVTGMWGEVIETLTDIK3IJ3AD 190
E +G+ +TLTV+ MI +PDV G+V+ + LDG + D
Sbjct: 585 EPGGRAAEGD-TVTLTVSKBPRMIEVPDWGDSVDDAKQKLEDAGFEVD 632

Identities = 45/161 (27%), Positives = 80/161 (48%), Gaps = 4/161 (2%)

Query: 27 RVPDVSNKTVAQAKMTLENSGLKVGAIRNIESDSVSEGLWKTDPAAGRSRREGAKVNLY 86
+VP + +KT AQA+ L+++GL VG +R+ SD+V G V+ TDP G R+ V+L
Sbjct: 405 iCVPPLLSKTEAQARDRLDDAGLDVGKVRHAYSDTVERGKVISTDPGVGDRIRKNDSVSLT 464

Query: 87 lATPNKSBTIXSmKEHlTYKDILKDLQGKGVKKSLIKVKRKINNDYTTGTILRQSLPBGTS 146
++ + L + + L+ +G++ + V R +++ G +++ GT+
Sbjct: 465 VSIX3PDTVKLPDVTGYKLDK3iRTLLEDRGIEPGM--VTHAESDEVARGFVISTKPGSGTT 522

Query: 147 FNPDGNKKLTLTVAVNDPMIMPDVTGMTVGEVIETLTDLGL 187
+ L V+ P+ +PDVTG + E L GL
Sbjct: 523 VR--AGSAVaLWSRGSPVDVPDVTGDDLDEARAELEGA6L 561

There is also homology to SEQ ID 3026.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2271 

A DNA sequence (GBSx2402) was identified in S.agalactiae <SEQ ID 7013> which encodes the amino 
acid sequence <SEQ ID 7014>. Analysis of this protein sequence reveals the following:

Possible site: 38 

>>> Seems to have an uncleavable N-term signal seq

Final Results

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9311> which encodes amino acid sequence <SEQ ID 9312> 
was also identified.

The protein has homology with the following sequences in the GENPEPT database.

>GP:AAB90561 GB:AE001058 glutamine ABC transporter, ATP-binding
protein (glnQ) [Archaeoglobus fulgidus]
Identities = 142/219 (64%), Positives = 178/219 (80%)

Query: 1 MDIHQGEVWIIGPSGSGKSTFLRTMNLLEVPTKGTVTFEGIDITDKKNDIFKMREKMGM 60
M + +GEWVIIGPSGSGKST LR +N LE PT G + +G+DIT+ K DI K+R+++G+
Sbjct: 24 MKVEKGEVWIIGPSGSGKSTLLRCINRLEEPTSGKILLDGVDITNSKIDINKVRQRIGI 83

Query: 61 VFQQEm,FPmTVLENITLSPIKTKGLSNLDAQTKAYEIiLEKVGLKEKflNTYPASLSGGQ 120
VFQQFNLFP++T L+N+TL+PIK K +S +A+ LLEKVGL++KA+ YPA LSGGQ
Sbjct: 84 VFQQFmPPHLTALQim'IAPIKIKKMSKREAEELGMRLLEKVGLEDKADYYPAQLSGGQ 143

Query: 121 QQRIAIARGLAmPDVLLFDEPTSALDPEMTOEVLTVMQDLAKSGMTIWIOT^ 180
QQR+AIAR LAMNP+V+LFDE TSALDPE+V EVL VM+ IiA+ GMTMV+VTHEMGFARE
Sbjct: 144 QQRVAIARAIMNPEVMLFDEVTSALDPELVKEVLDVMKQIARDGMTMVVVTHEMGFARE 203

Query: 181 VADRVIFMDAGIIVEQGAPKEVFEQTICEIRTRDFLSKVL 219
V DRVIFMD G+IVE+G P+++F K RTR FLS +L
Sbjct: 204 VGDRVIFMDGGVIVEEGKPEQIFSNPKHERTRKFLSMIL 242

There is also homology to SEQ ID 1186. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2272 

A DNA sequence (GBSx2403) was identified in S.agalactiae <SEQ ID 7015> which encodes the amino 
acid sequence <SEQ ID 7016>. This protein is predicted to be 4-hydroxy-2-oxoglutarate aldolase (kdgA). 
Analysis of this protein sequence reveals the following:

Possible site: 43 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.1479 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB14127 GB:Z99115 deoxyphosphogluconate aldolase [Bacillus subtilis]
Identities = 21/62 (33%), Positives = 38/62 (60%), Gaps = 4/62 (6%)

Query: 3 QLMQGKIVAVIRGNSQEEAFQAAQACIKGGISAIEIAYTNSKASQVIEQLVTQYTNQEQV 62
+1) + K++AVIR ++EA Q ++ + GI A+E+ YT AS +IE + N+B +
Sbjct: 9 RIiKEAKLIAVIRSKDKQEACQQIESLIiDKGIRAVEVTYTTPGRSDIIE SFBNREDI 64

Query: 63 W 64
++
Sbjct: 65 LI 66

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2273 

A DNA sequence (GBSx2405) was identified in S.agalactiae <SEQ ID 7017> which encodes the amino 
acid sequence <SEQ ID 7018>. This protein is predicted to be H repeat-associated protein (rfbQRS)
(b1458). Analysis of this protein sequence reveals the following:

Possible site: 27 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.0207 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
There is homology to SEQ ID 504.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2274 

A DNA sequence (GBSx2406) was identified in S.agalactiae <SEQ ID 7019> which encodes the amino
acid sequence <SEQ ID 7020>. Analysis of this protein sequence reveals the following: 

Possible site: 14 

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -6.74 Transmembrane 2 - 18 ( 1 - 21)
INTEGRAL Likelihood = -3.03 Transmembrane 73 - 89 ( 73 - 92)
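
Each INTEGRAL line carries a likelihood score and two residue ranges. A minimal sketch for pulling these fields out of the text is given here; the function name parse_tm_segments and the dictionary keys are hypothetical, and since the meaning of the parenthesized range is not stated in the text, the sketch simply records it as a second range.

```python
import re

# Spacing around "-" and "(" varies in scanned copies, so it is matched loosely.
TM_RE = re.compile(
    r"INTEGRAL\s+Likelihood\s*=\s*(?P<lik>-?\d+\.\d+)\s+Transmembrane\s+"
    r"(?P<start>\d+)\s*-\s*(?P<end>\d+)\s*"
    r"\(\s*(?P<p_start>\d+)\s*-\s*(?P<p_end>\d+)\s*\)"
)


def parse_tm_segments(text):
    """Return one dict per INTEGRAL line found in text."""
    segments = []
    for m in TM_RE.finditer(text):
        segments.append({
            "likelihood": float(m.group("lik")),
            "range": (int(m.group("start")), int(m.group("end"))),
            "paren_range": (int(m.group("p_start")), int(m.group("p_end"))),
        })
    return segments


segments = parse_tm_segments(
    "INTEGRAL Likelihood = -6.74 Transmembrane 2 - 18 ( 1 - 21)\n"
    "INTEGRAL Likelihood = -3.03 Transmembrane 73 - 89 ( 73 - 92)\n"
)
for seg in segments:
    print(seg)
```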

Final Results 

bacterial membrane --- Certainty=0.3697 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 
There is also homology to SEQ ID 3376. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2275 

A DNA sequence (GBSx2407) was identified in S.agalactiae <SEQ ID 7021> which encodes the amino
acid sequence <SEQ ID 7022>. This protein is predicted to be insertion element IS1 protein InsB (insB_5).
Analysis of this protein sequence reveals the following:

Possible site: 52

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.4280 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2276 

A DNA sequence (GBSx2409) was identified in S.agalactiae <SEQ ID 7023> which encodes the amino 
acid sequence <SEQ ID 7024>. Analysis of this protein sequence reveals the following: 

Possible site: 13

>>> Seems to have no N-terminal signal sequence

Final Results 

bacterial cytoplasm --- Certainty=0.3937 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 
No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2277 

A DNA sequence (GBSx2410) was identified in S.agalactiae <SEQ ID 7025> which encodes the amino
acid sequence <SEQ ID 7026>. This protein is predicted to be triosephosphate isomerase (tpi). Analysis of 
this protein sequence reveals the following: 

Possible site: 53 

>>> Seems to have no N-terminal signal sequence
INTEGRAL Likelihood = -0.37 Transmembrane 35 - 51 ( 35 - 51)

Final Results 

bacterial membrane --- Certainty=0.1150 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAC43268 GB:U07640 triosephosphate isomerase [Lactococcus lactis]
Identities = 50/75 (66%), Positives = 61/75 (80%)

Query: 6 IaGNWK^mNPEEAK^^IEAVaSKLPSSELVEflGIAAPMiTLSTVLEaAKGSELKIAaQN 65 

lAGNWKMNK EA+AF+EAV + I,PSS+ VE+ I APAL L+ + +6SELK+AA+N 
Sbjct: 7 OJMSNWKMimiiSEAQAFVEAVKNNLPSSniS^ 66 


Query: 66 SYFENSGAFTGENSP 80 

SYFEN+GAFTGENSP
Sbjct: 67 SYFENAGAFTGENSP 81 

There is also homology to SEQ ID 6838:

Identities = 58/77 (75%) , Positives = 68/77 (87%) 

Query: 6 lAGIWKMNKlTPEEAKAFIEAVASKLPSSELVEAGIflAPALTLSTVLEAAKGSELKIAAQN 65 
IAGNWKMNKNP+EAKAF+EAVASKLPS++LV+ +AAPA+ L T +EAAK S LK+AAQN 
Sbjct: 7 IA(amKMNKNPQEJaC3\F^7EA^mSKLPSTDLVDVAV3««>AVDLVTTIEftAK^ 66

Query: 66 SYFENSGAFTGENSPKV 82 

YFEN+GAFTGE SPKV
Sbjct: 67 CYFENTGAFTGETSPKV 83

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2278 

A DNA sequence (GBSx2412) was identified in S.agalactiae <SEQ ID 7027> which encodes the amino 
acid sequence <SEQ ID 7028>. Analysis of this protein sequence reveals the following:

Possible site: 20 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -2.39 Transmembrane 96 - 112 ( 96 - 112) 

Final Results

bacterial membrane --- Certainty=0.1956 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>
The protein has homology with the following sequences in the GENPEPT database.

>GP:BAA14368 GB:D90354 surface protein antigen precursor [Streptococcus sobrinus]
Identities = 60/129 (46%), Positives = 76/129 (58%), Gaps = 18/129 (13%)

Query: 3 ISFDNSFLEWSDDSAFQftDVYLQMKRIAaGQVENTYLHTVNGYVISSISrrVVTHTPQPEE 62 

++F FL +VS DSAFQA+VYLQMKRIA G NTY++TVNG SSNTV T TP+P++ 
Sbjct: 1442 VTPKEDFLRSVSVDSAFQREVYLQMKRIAVGTFJaWYVNTVNGITYSSK^^ 1501 

Query: 63 PSPNQP TPPQPPIETIEPPVPASILENTGEQES LLGLIG- -AGILLGT 108

PSP P P Q PP A LP TG+ + LLGL+ AG h 

Sbjct: 1502 PSPVDPKTTTTWFQPRQGKAYQPAPPAGAQ-LPATGDSSNAYLPLLGLVSLTAGFSL-- 1558 

Query: 109 AYGLKKKEE 117 
GL++K++

Sbjct: 1559 -LGLRRKQD 1566 

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2279 

A DNA sequence (GBSx2413) was identified in S.agalactiae <SEQ ID 7029> which encodes the amino 
acid sequence <SEQ ID 7030>. Analysis of this protein sequence reveals the following: 

Possible site: 23

>>> Seems to have no N-terminal signal sequence

Final Results 

bacterial cytoplasm --- Certainty=0.3691 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9359> which encodes amino acid sequence <SEQ ID 9360> 
was also identified. 

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB15793 GB:Z99123 phosphotransacetylase [Bacillus subtilis]

Identities = 131/221 (59%) , Positives = 169/221 (76%) , Gaps = 2/221 (0%) 

Query: 6 LVDPVILGKADEVHDSLARLGFVDQDYSIIDPEQYEKFEEMKEAFVEIRKGKATMEDADR 65 
+++P+++G +E+ L I DP YE E++ +AFVE RKGKAT E A + 

Sbjct: 41 VUIPIVIGNEI!ffiIQAKAKErJS^:lTLGGVKIYDPHTYEG^^^ 100

Query: 66 LLKDVNYFGVMLVKMLADGMVSQAIHSTADTVRPALQIIKTKPGISRTSGVFLMin^^ 125 

L D NYFG MLV GLADG+VSGA HSTADTVRPALQIIKTK G+ +TSGVF+M R 
Sbjct: 101 ALLDENYFGTMbVYKGLRDGLVSGAAHSTADTVRPALQIIKTKEGVKKTSGVFIMARG-- 158 


Query: 126 QERYIFADCAINIDPNAQELAEIAVNTADTAKIFDIDPKIAMLSFSTKGSAKAPQAEKVQ 185 

+E+Y+FADCAINI P++Q+LAEIA+ +A+TAK+FDI+P++AMLSFSTKGSAK+ + EKV 
Sbjct: 159 EEQYVFADCAINIAPDSQDLAEIAIESAOTAKMFDIEPRVAMLSFSTKGSAKSDETEKVA 218 

Query: 186 EAAKIAKDLSPELAVDGELQFDAAFVPETAEIKAPNSDVAG 226

+A KIAK+ +PEL +DGE QFDAAFVP AE KAP+S++ G
Sbjct: 219 DAVKIAKEKAPELTLDGEFQFDAAFVPSVAEKKAPDSEIKG 259

A related DNA sequence was identified in S.pyogenes <SEQ ID 7031> which encodes the amino acid
sequence <SEQ ID 7032>. Analysis of this protein sequence reveals the following:

Possible site: 34 
>>> Seems to have no N-terminal signal sequence



Final Results 

bacterial cytoplasm --- Certainty=0.3182 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below. 

Identities = 181/227 (79%) , Positives = 211/227 (92%) 

Query: 1 MKFEGLVDPVILGKADKVHDSLftRIXSFVDQDYSIIDPEQYEKFEEMKEAFVEIRKGKATM 60 

+KFEGL++P+ILG+++EV + L +LGF DQDY+II+P +Y F++MKEAFVE+RKBKA.T+ 
Sbjct: 38 LKFEGIiIiEPIIIK3QSEEVRiniLTKLGFADQDYTIINElffiYADFDKMKEAFVEVRKGK^^ 97 

Query: 61 EDADRLLKDVNYFGVMLVKLGMI^GMVSGAIHSTADTVRPALQIIKTKPGISRTSGVFLM 120 

EI5AD++L+DVNYFGVMLVK+GIiADGMVSGAIHSTADTVRPALQIIICrKPGISRTSGVFLM 
Sbjct: 98 EDADKMLRDVlWFGVMDVKMGIiaDGMVSGaiHSTADTVRPAIiQI 157 

Query: 121 miEOTQERYIFMCAINIDPNAQELBEIAVOTADTAKIFDIDPKIJmSFSTKaSAK^ 180 

NRENT ERY+FADCAINIDP AQELAEIAVNTA+TAKIFDIDPKIAMLSFSTRGS KAPQ 
Sbjct: 158 NREOTSERYVTADGaiNIDPTAQELAEIAVCWflETAKIFDIDPKIAMLSFSTKiGSSK^ 217 

Query: 181 AEKVQEAAKIAKDLSPELAVDGELQFDAAFVPETAEIKAPNSDVAGK 227 

+KV+EA +IA L+P+LA+DGELQFDAAFVPETA IKAP+S VAG+ 
Sbjct: 218 VDKVREATEIATGIiNPDLRLDGELQFDAAFVPETAAIKAPDSAVAGQ 264 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2280 

A DNA sequence (GBSx2414) was identified in S.agalactiae <SEQ ID 7033> which encodes the amino 
acid sequence <SEQ ID 7034>. This protein is predicted to be lipopolysaccharide biosynthesis protein- 
related protein. Analysis of this protein sequence reveals the following: 

Possible site: 61 

>>> Seems to have no N-terminal signal sequence



Final Results 

bacterial cytoplasm --- Certainty=0.4076 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>



The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAG19110 GB:AE005009 Vng0600c [Halobacterium sp. NRC-1] 
Identities = 57/176 (32%) , Positives = 86/176 (48%) , Gaps = 20/176 (11%) 

Query: 1 MKVLLYLEftEEYLKKSGIGRAIKHQEKALQIAGIDYTTNPT 41 

M+ I. YLEA E L+ G+ A Q AL+ ++ P 
Sbjct: 2 MRAIiNYLEftAEALR-GGMVTATNQQRAALETTDVEVVETPWRAGDPVRSIGSLAAGGSCF 60 

Query: 42 DDFDLVHIWTTYGIRSWLLMSKAKKTGKKVIMHGHSTEEDFRNSFIGSNLVSPLFKWYLCR 101 

FD+ H N G S + A++T +++H H T EDF SF GS+ ++P + YL 
Sbjct: 61 TAFDVAHCISniiVGPGSVAVARHARRTDTPLVLHaHLTREDFAQSFRGSSTIAPALEPYIiRW 120 

Query: 102 FYQKRDAIITPTDYSKQLIKAYGIKKPIFVXiSNGIDLSRYQXSEKKESAFHHYFHL 157 

FY +AD ++ P++Y+K +++AY + PI LSNG+DL Q E + R F L 
Sbjct: 121 FYSQflDLVLCPSEYTKDVLRAYPVDAPIRQLSNGVDLESMQGYESFRADTRARFDL 176 



There is also homology to SEQ ID 1220. 
Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2281 

A DNA sequence (GBSx2415) was identified in S.agalactiae <SEQ ID 7035> which encodes the amino 
acid sequence <SEQ ID 7036>. Analysis of this protein sequence reveals the following:

Possible site: 41 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2625 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAC35010 GB:AF055987 intracellular alpha-amylase [Streptococcus mutans]

Identities = 27/46 (58%) , Positives = 33/46 (71%) 

Query: 1 MEVGEIYAGKrFVDYIiGNCEQEVVIGDDCSWGDFLVESASISAWVPK 46 

M +GE K FVDYL NO +EV++ D GWGDF V+ AS+SAWV K 
Sbjct: 438 MNMGEFNRNKVFVDYLNNCTEEVILDDQGWGDFPVQEASIjSAWVNK 483

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2282

A DNA sequence (GBSx2416) was identified in S.agalactiae <SEQ ID 7037> which encodes the amino
acid sequence <SEQ ID 7038>. This protein is predicted to be RopA. Analysis of this protein sequence
reveals the following:

Possible site: 24

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2082 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

There is also homology to SEQ ID 6908: 

Identities = 30/35 (85%) , Positives = 33/35 (93%) 

Query: 1 MEADQVRGLLSADMLKHDIAMKKAVDVITSSATVK 35

M ADQVR LLSADMLKHDIAMKKAV+VITS+A+VK
Sbjct: 422 MPADQVRSLLSADMLKHDIAMKKAVEVITSTASVK 456 
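
The Identities count reported for an alignment can be cross-checked directly against the two aligned rows. The sketch below does this for the short alignment above; the helper name count_identities is hypothetical, and the percentage is truncated to an integer on the assumption that this matches how the summary line was printed.

```python
def count_identities(query, sbjct, gap="-"):
    """Count alignment columns where both rows carry the same residue."""
    if len(query) != len(sbjct):
        raise ValueError("aligned rows must have equal length")
    identical = sum(1 for q, s in zip(query, sbjct) if q == s and q != gap)
    return identical, len(query)


# Rows copied from the alignment to SEQ ID 6908 shown above.
query = "MEADQVRGLLSADMLKHDIAMKKAVDVITSSATVK"
sbjct = "MPADQVRSLLSADMLKHDIAMKKAVEVITSTASVK"

identical, length = count_identities(query, sbjct)
print(identical, length, 100 * identical // length)  # 30 35 85
```

The result agrees with the reported Identities = 30/35 (85%), which also confirms that both rows survived scanning intact.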

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics.

Example 2283 

A DNA sequence (GBSx2417) was identified in S.agalactiae <SEQ ID 7039> which encodes the amino 
acid sequence <SEQ ID 7040>. This protein is predicted to be DNA-directed RNA polymerase, subunit 
delta. Analysis of this protein sequence reveals the following: 
Possible site: 54

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2407 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB15744 GB:Z99123 RNA polymerase (delta subunit) [Bacillus subtilis]
Identities = 62/186 (33%) , Positives = 102/186 (54%) , Gaps = 15/186 (8%) 



Query: 1 MELEVFAGQEKSELSMIEVARAILEQRGRDNEMYFSDLVNDIQTYLGKSDSAIRESLPFF 60
M ++ ++ +E E++++E+A + E+ + + F +I1+N+I + LG + + + F
Sbjct: 1 MGIKQYSQEELKEMaLVEIAHELPEEHKKP--VPFQEIJJffiIASLLGVTCKEELG^ 58

Query: 61 YSDLNTDGSFIPLGENKWGLRSWYAIDEIDEEIITLEEDEDGAPKRKKKRVNAFMDGDED 120
Y+DLN DG F+ L + WGLRSWY D++DEE K KKK+ ++ D D
Sbjct: 59 YTDMIDGRFLALSDQTWGLRSWYPYDQLDEE TQPTVKAKKKKAKKAVEEDLD 111

Query: 121 AIDYNDDDPEDEDFTEETPSLEYDEENPDDEKSEVESYDSEINEIIPDEDLDEDVEINEE 180
++ + D +D D E L+ + ++ D+E + + D EI E I DED DED
Sbjct: 112 LDEFEEIDEDDLDIiDEVEBELDLEADDFDEEDLDEDDDDLEIEEDlIDED-DEDY 165

Query: 181 DDEEEE 186
DDEEEE
Sbjct: 166 DDEEEE 171

A related DNA sequence was identified in S.pyogenes <SEQ ID 7041> which encodes the amino acid 
sequence <SEQ ID 7042>. Analysis of this protein sequence reveals the following: 

Possible site: 33 

>>> Seems to have no N-terminal signal sequence

Final Results 

bacterial cytoplasm --- Certainty=0.2263 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below. 

Identities = 162/191 (84%) , Positives = 181/191 (93%) , Gaps = 1/191 (0%) 



Query: 1 MELEVFAGQEKSELSMIEVARAILEQRGRDNEMYFSDLVNDIQTYLGKSDSAIRESLPFF 60
++L+VFAGQEKSELSMIEVARAILE+RGRDNEMYFSDLVN+IQ YLGKSD+ IR +LPFF
Sbjct: 12 LKLDVFAGQEKSELSMIEVARAILEERGRDNEMYFSDLVNEIQNYLGKSDAGIRHALPFF 71

Query: 61 YSDLNTDGSFIPLGENKWGLRSWYAIDEIDEEIITLEEDEDGAPKRKKKRVNAFMDGDED 120
Y+DLNTDGSFIPLGENKWGLRSWYAIDEIDEEIITLEEDEDGA KRKKKRVNAFMDGDED
Sbjct: 72 YTDLNTDGSFIPLGENKWGLRSWYAIDEIDEEIITLEEDEDGAQKRKKKRVNAFMDGDED 131

Query: 121 AIDYNDDDPEDEDFTEETPSLEYDEENPDDEKSEVESYDSEINEIIPDEDLDEDVEINEE 180
AIDY DDDPEDEDFTEE+ +EYDEE+PDDEKSEVESYDSE+NEIIP++D E+V+INEE
Sbjct: 132 AIDYKDDDPEDEDFTEESAEVEYDEEDPDDEKSEVESYDSELNEIIPEDDF-EEVDINEE 190

Query: 181
D+E+EE+EE V
Sbjct: 191 DEEDEEDEEPV 201

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 
Example 2284

A DNA sequence (GBSx2418) was identified in S.agalactiae <SEQ ID 7043> which encodes the amino 
acid sequence <SEQ ID 7044>. This protein is predicted to be CTP synthetase (pyrG). Analysis of this 
protein sequence reveals the following: 

Possible site: 23 

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -0.11 Transmembrane 5 - 21 ( 5 - 21)



The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAA09021 GB:AJ010153 CTP synthetase [Lactococcus lactis subsp. cremoris] (ver 2)
Identities = 421/533 (78%), Positives = 481/533 (89%)

Query: 2 TKYIFVTGGWSSIGKGIVAASLGRLLKNRGLKVTIQKFDPYINIDPGTMSPYQHGEVYV 61 

TKYIFVTGG SS+GKGIVAJVSLGRLLKNRGLKVT+QKFDPY+NIDPGTMSPYCJHGEV+V 
Sbjct: 3 TKYIFVTGGGTSSMGKBIVaaSLGRLLroniGLKVTVQKFDPYLIIIDPGTMSPYQHGEVFV 62 

Query: 62 TDDGAETDLDLGHYERFIDINLNKYSim'TGKIYSEVLKKERRGEYLGa.TVQVIPHVTDA 121 

TDDGAETDLDLGHYERFIDINLNKYSNVT+GK+YSE+L+KER+GEYLGATVQ++PHVT+ 
Sbjct: 63 TDDGAETDLDLGHYERFIDINLNKYSNVTSGKAreSEILRKERKGEYLGATVQ^WPHVT^M 122 

Query: 122 LKEKIKRJUVTTTDSDVIITEVGGTVGDIESLPFLEALRQMiaDVGSDNVMYIHTTLLPYL 181 

LKEKIKSAATTTD+D+IITEVGGTVGD+ESLPF+EaLRQMKA+VG+DNVMYIHT + +L 
Sbjct: 123 LKEKIKRRATTTDaDIIITEVGGTVGDMESLPFIEALRQMKREVGaDlTOiyiHTVPILH^ 182 

Query: 182 KaAGEMKTKPTQHSVKELRGLGIQENMLVIRTEQPAGQSIKNKLAQFCDVAPEflVIESUJ 241 

+AAGE+KTK Q++ K LR GIQ NMLV+R+E P +++K+A FCDVAPEAVI+SLD 
Sbjct: 183 RAWSELKTKIAQiaTKTIJffiYGIQKtMLVLRSEVPITTENffiDKIM^ 242 

Query: 242 VDHIYQIPLl^IQAQIsMDQIVCDHLKLETPAADMTEWSAMVDKOTINLEKKOTCIALVGKYVE 301 

V+H+YQIPLN+QAQNMDQIVCDHLKL+ P ADM EWSAMVD VMNL+KKVKIALVGKYVE 
Sbjct: 243 VEHLYQIPLNLQAQNMDQIVCBHLKLnRPKADMAEWSflMVDHVl^ 302 

Query: 302 LPDAYLSVVEALKHSGYVNDVAIDLKWVNRAEVTEDNIKELVGDADGIIVPGGFGQRGSE 361 

LPDAY+SV EALKH+GY +D +D+ WVNA +VT++N+ ELVGDA GIIVPGGFGQRG+E 
Sbjct: 303 LPDAYISVTEALKHAGYASDAEVDINWVNANDVTDENVAELVGDAAGIIVPGGFGQRGTE 362 

Query: 362 GKIEAIRYAREJTOVPMLGVCLGMQLTCVEFARNVIiraiHGayKrSA^ 421 

GKI AI+YARENDVPMLG+CLGMQLT VEEARNVL L GA+S ELDP+T +P+IDIMRDQ 
Sbjct: 363 GKIAAIKYARENDVPMLGICLGMQLTAVEFARNVLGIiEGAHSFELDPETKyPVIDIMRDQ 422 

Query: 422 IDIEDMGGTLRLGLYPCKLKSGSRAAAAYNNQEWQRRHRHRYEFNTKFREQFEAAGFVF 481 

+D+EDMGGTLRLGLYP. KLK+GSRA AAYN+ EWQRRHRHRYEFN K+RE FE AGFVF 
Sbjct: 423 VDVEDMGGTLRLGLYPAKLKNGSRAKaAYNDAEWQRRHRmYEFNNKY^ 482 

Query: 482 SGVSPDHRLMEWELPEKKFFVAAQYHPELQSRPNHREELYTAFVTAAVENMK 534

SGVSPDNRL+E+VEL KKFFVA QYHPELQSRPN EELYT F+ AVEN K
Sbjct: 483 SGVSPDNRLVEIWLSGKKFFVACQYHPELQSRPNRPEELYTEPIRVA.VENSK 535

A related DNA sequence was identified in S.pyogenes <SEQ ID 7045> which encodes the amino acid
sequence <SEQ ID 7046>. Analysis of this protein sequence reveals the following:

Possible site: 23 
>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -0.11 Transmembrane 5 - 21 ( 5 - 21) 



Final Results

bacterial membrane --- Certainty=0.1044 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the databases: 

>GP:CAA09021 GB:AJ010153 CTP synthetase [Lactococcus lactis subsp. cremoris] (ver 2)
Identities = 423/532 (79%), Positives = 483/532 (90%)

Query: 2 TKYIFVTGGWSSIGKGIVAASLGRLLKNRGLKVTIQKFDPYINIDPGTMSPYQHGEVYV 61 
TKYIFVTGG SS+GKGIVAASLGRLLKNRGLKVT+QKFDPY+NIDPGTMSPYQHGEV+V

Sbjct: 3 TKYIFVTGGGTSSMGKBIVaASIXSRLLKNRGLKOTWKPDPYIiNIDPGTMSPyQHGEVFV 62 

Query: 62 TDDGRETDLDLGHYERFIDINIilKySNVTTGKIYSEVLRKERKGEYLGRTVQVIPHITDA 121 
TDDGaETDLDLGHYERFIDINIiNKySNVT+GK+YSE+LRKERKGEYLGAWQ++PH+T^ 
Sbjct: 63 TDDGAETDIiDLGHYERFIDiraJJKySim'SGK\nrSEILRKERKGEYIiGATV^^ 122

Query: 122 LKEKIKRAASTTDSDVIITEVGGTVGDIESLPFLE?U^QMKZiIM3SENVlWlinT^ 181 

LKEKIKRAA+TTD+D+IITEVGGTVGD+ESLPF+EALRQMKA+VG++NVMYIHT + +L 
Sbjct: 123 LKEKIKRAATTTDADIIITEVGGTVGDMESLPFIEAIjRQMKABVGADNVMYIHTTO 182 


Query: 182 KAAGEMKTKPTQHSVKELRGLGIQPNMLVIRTEEPVEQGIKNKLAQFCDVNSEAVIESRD 241 

+AAGE+KTK Q++ K LR GIQ NMLV+R+E P+ +++K+A FCDV EAVI+S D 
Sbjct: 183 RAAGELKTKIAQNATKTLREYGIQANMLVLRSEVPITTEMRDKIAMFCDVaPEAVIQSLD 242 

Query: 242 VEHLYQIPiaiLQfiQSMDQIVCDHriKIiNAPQaDMTEWSAMVDKV^^ 301

VEHLYQIPUSILQaQ+MDQIVCDHLKL+AP+ADM EWSAMVD VMNL+K KIALVGKYVE 
Sbjct: 243 VEHLYQIPI*aQAQNMDQIVCDHLKLnaPKftDMAEWSftMVDHVmDKK^ 302 






Query: 302 LPraYLSVVEALKHSGYANDTMDLKWVNAlTOVTVDI^^ 361 

LPDAY+SV EALKH+GYA+D +D+ WVNANDVT +N A+L+GDA GIIVPGGFGQRGTE 
Sbjct: 303 LPDAYISOTEALKHAGYASDAEVDIlSmVNftNDVTDENVAELVGDAafillVE^ 362 



Query: 362 6KIQAIRY2a?ENDVPMLGICLGMQLTC\raFARHVIJilMEGRNSFEIjEPSTKYPIXDlI(^^ 421 
GKI AI+YARENDVPMLGICLGMQLT VEFAR+VL +EGA+SFEL+P TKYP+IDIMRDQ

Sbjct: 363 GKIAAIKYARENDVPIffifilCI^QLTAVEFARimjGIiEGaHSFEI^PETKYPVIDIM^ 422 

Query: 422 IDIEDMGGTLRLGLYPCKLKPGSKAAMAYIJNQEWQRRHRHRYEFNNKFRPEFEAAGFVF 481 
+D+EDMGGTLRLGLYP KLK 6S+A AYN+ EWQRRHRHRYEFNNK+R +FE AGFVF 
Sbjct: 423 VDVEDMGGTLRICLYPAKLKNGSRAKftAYNDAEWQRRHRHRYEFlSn^^ 482

Query: 482 SGVSPDNRLVEIVELKEKKPFWAAQYHPELQSRPiSIRPEEriYTAFVTAAIKNS 533 

SGVSPDNRLVEIVEL KKFFVA QYHPELQSRPNRPEELYT F+ A++NS 
Sbjct: 483 SGVSPDNRLVEIVELSGKKFFVACQYHPELQSRPNRPEELYTEFIRVAVENS 534 


An alignment of the GAS and GBS proteins is shown below. 

Identities = 477/532 (89%) , Positives = 503/532 (93%) 

Query: 1 MTKYIFVTGGWSSIGKGIVAASLGRLLKNRGLKVTIQKFDPYINIDPGTMSPYQHGEVY 60 
MTKYIFVTGGWSSIGKGIVAASLGRLLKNRGLKVTIQKFDPYINIDPGTMSPYQHGEVY

Sbjct: 1 MTKYIFVTGGWSSIQKGlVAASLGRLLKmGLKVTIQKFDPYIHIDPGTMSPYQHGEVY 60 

Query: 61 VTDIX3aETDLDIX3HYERFIDI]SriaJKYSNVTTGKIYSBVLKKERRGEYI^ 120 
VTDDGMTDLDLGHYERFIDINIiNKYSlOTTGKIYSEVL+KER+GEYLGATVQVI 
Sbjct: 61 VTDDGRETDLDI^HYERPIDiraiJKYSNVTTGKIYSEVIJlKERKBK^^ 120

Query: 121 ALKEKIKRAATTTDSDVIITEVGGTVGDIESLPFLEALRQMKADVGSDNVMYIHTTLLPY 180 

ALKEKIKRAA+TTDSDVIITEVGGTVGDIESLPFLEALRQMKADVGS+NVMYIHTTLLPY 
Sbjct: 121 ALKEKIKRAASTTDSDVIITEVGGOTGDIESLPFLEftliRQMKADVGSENVMyiHTTLLPY 180 


Query: 181 LKAAGEMKTKPTQHSVKEIJlGLGIQPNMLVIRTEQPAGQSIKlSrKLAQFCDVAPEAVIESL 240 

LKRAGEMKTKPTQHSVKELRGLGIQENMI.VIRTE+P Q IKNKLAQFCDV EAVIES 
Sbjct: 181 LKMGEMKTKPTQHSVKELRGLGIQPNMLVIRTEEPV^iQGIKimiAQFCDVNSEAVIESR 240 

Query: 241 DVDHIYQXPJMVIQAQNI^QIVCDHLKLETPAADm'EWSAiyrraKVrmEKKVK 300

DV+H+YQIPLM+QAQ+MDQIVCDHLKL P ADMTEWSAMVDKVMNL K KIALVGKYV 
Sbjct: 241 DVEHLYQIPLNLQAQSMDQIVCDHLIOMAPQADMTEVISAMVDK^iaaRKTTKIALVGKyV 300

Query: 301 ELPDAYLSVVEALKHSGYVNDVAIDLKWVNAAEWEDNIKELVGrffiDGIIVPGGFG^^^ 360 

ELPDAYLSWEMiKHSGY ND AIDLKWVNA +VT DN +L+GDADGIIVPGGFGQRG+ 
Sbjct: 301 ELPDAYLSVVEALKHSGYAlTOTAIDLKm/NairoVTVDNAADLLGDADGIIVPGGFGQRGT 360

Query: 361 EGKIEAIRYARENDVPMLGVCLGMQLTCVEFARHVLNLHGANSAELDPKTPFPIIDIMRD 420 

EGKI+AIRYARENDVPMLG+CLGMQLTCVEFAR+VI1N+ GANS EL+P T +PIIDIMRD 
Sbjct: 361 EGKIQAIRYAREISroVPMLGICL(a4QLTCTOFARHVi™EGRNSFELEPSTK:YPIIDIMRD 420 

Query: 421 QIDIEDMGGTLRLGLYPCKLKSGSRAAAAYNNQEWQRRHRHRYEFNTKFREQFEAAGFV 480 

QIDIEDMGGTLRIiGLYPCKLK GS+AA AYWNQEWQRRHRHRYEFN KFR +FEAAGFV 
Sbjct: 421 QIDIEDMGGTLRLGLYPCKliKPGSKAAMAYimQEWQRRHRHRYEBTSINKFRPEFEAAGFV 480 

Query: 481 FSGVSPDNRLMEWELPEKKFFVRAQYHPELQSRPNUAEEIiYTAFVTAAVEN 532

FSGVSPDNRL+E+VEL EKKFFVAAQYHPELQSRPN EELYTAFVTAA++N 
Sbjct: 481 FSGVSPDNRLVEIVEIiKEKKFFVAAQYHPELQSRPNRPEELYTAFVTAAIKN 532 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics.

Example 2285 

A DNA sequence (GBSx2419) was identified in S.agalactiae <SEQ ID 7047> which encodes the amino 
acid sequence <SEQ ID 7048>. Analysis of this protein sequence reveals the following: 

Possible site: 34 

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -9.92 Transmembrane 13 - 29 ( 3 - 34)

Final Results 

bacterial membrane --- Certainty=0.4970 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9285> which encodes amino acid sequence <SEQ ID 9286> 
was also identified. 

The protein has homology with the following sequences in the GENPEPT database.

>GP:CAB14296 GB:Z99116 yqkD [Bacillus subtilis]
Identities = 79/289 (27%) , Positives = 139/289 (47%) , Gaps = 8/289 (2%) 

Query: 1 MKKIRLSKFIKMIWILFLISVAASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFD 60 
MKKI L+ I +V + I + S + + D+ I + G+ ++ +SF+

Sbjct: 1 MKKILIjA--IGaLVTAVIAlGIVFSHMILFIKKKTDED--lIKRETDNGHDVF ESFE 53 

Query: 61 KIjLKQKIEMTNGNIKjQ!VAWYVPAVKKTHKTAVVVH6FANSKENMKAYGWLFHI^ 120 
++K ++ +YA TT++HG + N YLF LG+NVL+ 

45 Sbjct: 54 Q^IEKTAFVIPSAYGYDIKGYHVaPHDTEmIIICH6VTMNVIJSISLKyMHLFI£)I^ 113 

Query: 121 PDNIAHGESHGQLIGYGIOTIRENIIKWTEMIVDK-NPSSQITLFGVSMGGATVMMASGEK 179 

D+ HG+S G+ YG+ +++++ K ++ +K N I + G SMG T ++ +G 
Sbjct: 114 YDHRRHGQSGGKTTSYGFYEKDDIJSKOTSLLKNKnraRGLIGIHGESMGAVTALLYAGAH 173 


Query: 180 LPSQVWIIEDCGYSSVWDELKFQRKEMYGLPAFPLLYEVSTISKIRAGFSYGQASSVEQ 239 

I DC ++ ++L ++ + Y LP++PLL K+R G+ + S + 

Sbjct: 174 CSDGADFYIADCPFACFDEQLAYRLRAEYRLPSWPLLPIADFFLKLRGGYRAREVSPLAV 233 

Query: 240 LKKNNLPALFIHGDKDNFVPTSMVYDNYKATAGKKELYIVKGAKHAKSF 288

+ K P LFIH D+++P S Y+ G K LYI + +HA S+ 

Sbjct: 234 IDKIEKPVLFIHSKDDDYIPVSSTERLYEKKRGPKALYIAENGEHAMSY 282 
A related DNA sequence was identified in S.pyogenes <SEQ ID 7049> which encodes the amino acid
sequence <SEQ ID 7050>. Analysis of this protein sequence reveals the following: 

Possible site: 24 
>>> Seems to have an uncleavable N-term signal seq 
INTEGRAL Likelihood = -7.48 Transmembrane 10 - 26 ( 3 - 32)

Final Results

bacterial membrane --- Certainty=0.3994 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the databases: 

>GP:CAB14296 GB:Z99116 yqkD [Bacillus subtilis]
Identities = 88/295 (29%) , Positives = 145/295 (48%) , Gaps = 4/295 (1%) 

Query: 10 LGILFLLITLISVGaSFYFFHTOQIREEKSFINNKKRSTNNPLyPAEQSFDALPYEKRQL 69 

L 1 L+ +1++G F, H+ ++K+ + KR T+N + +SF+ + + 
Sbjct: 6 LAIGALVTAVIAIG--IVFSHMILFIKKKTDEDIIKRETDNG-HDVFESFEQMEKTAFVI 62 

Query: 70 TWRGLKQVGWYLPAAQKTm'AIVVJIGFTmKEDMKPYAMLFHDLGYJJ^ 129

+ +YA TTH-HGT+ + YLF DLG+NVL+ D+ H6+S 

Sbjct: 63 PSAyGYDIKGYHVAPHDTE^ITIIICHGVTMIW]aISLKXmLPLDI/3WNVLIYDH^^ 122 

Query: 130 EGNLIGYGW]roRLNVMAWrDQLI-KENPESQITLFGLSMGAAT\MvlASGE^ 188 
G YG+ ++ ++ L K N I + G SMGA T ++ +G I

Sbjct: 123 GGKTTSYGFYEKDDimWSIiLKNKTimRGLIGIHGESMGAVTALLYAGAHCSDGADFYI 182 

Query: 189 EDCGYASVWDELKFQAKRMYimPAFPLLYEVSALSKIRAGFSYGEASSVKQLaKNKRP 248 
DC +A ++L ++ +A Y riP++PLL K+R G+ E S + + K ++P L 

Sbjct: 183 ADCPFACFDEQLAYRLRAEYRLPSWPLLPIADFFLKLRGGYRftREVSPLAVIDKIEKPVL 242

Query: 249 FIHGDKDDFVPTKMVYDMKATKGPKElLIVKGRKHRKSPETNPEQYQKKIAAFIi 303 

FIH DD++P y+ +GPK + I + +HA S+ N Y+K + FL 

Sbjct: 243 FIHSKDDDYIPVSSTERLYEKKRGPKALYIAENGEHAMSYTKNRHTYRKTVQEFL 297 


An alignment of the GAS and GBS proteins is shown below. 

Identities = 203/294 (69%) , Positives = 246/294 (83%) 

Query: 1 MKKIRLSKFIKMIWXLFLISVAASFYFFHVAQVRDDKSFISNGQRKPGNSLYAYDKSFD 60 
MK IR++K++ ++ +++ LISV ASFYFFHVAQ+R++KSFI+N +R N LY ++SFD

Sbjct: 1 MKTIRIAKXLGILFLIiITLISVGASFYFFHVAQIREEKSFINNKKRSTNNPLYPAEQSFD 60 

Query: 61 KLLKQKIEMTNGNIKQVAlimrt>AVKKTHKTAVVVHGPANSKENl^ 120 

L +K ++TN+ +KQV l-JY+PA +KT KTA+WHGF N KE+MK Y LFH LGYNVLM 
Sbjct: 61 ALPYEKRQLTNRGLKQVGWYLPAAQKTKKTAIWHGFTNDKEDMKPYAMLFHDLGYWVLM 120

Query: 121 PDNIAHGESHGQIiIGYGWmJRENIIKWTEMIVDKNPSSQITLFGVSMGGATVMMASGEKL 180 

PDN AHGES G LIGYGWNDR N++ WT+ ++ +NP SQITLFG+SMG ATVMMASGE+L 
Sbjct: 121 PDNEAHGESEGMblGYGWiroRIiNVMAm'DQLiKENPESQITLFGLSMGaA 180 


Query: 181 PSQWNIIEDCGYSSVWDELKFQAKEMYGLPAPPLLYEVSTISKIRAGFSYGQASSVEQL 240 

P+QV ++IEDCGY+SVWDELKFQAK MY LPAFPLLYEVS +SKIRAGFSYG+ASSV+QL 
Sbjct: 181 PAQVTSLIEDCGYASVWDELKFQZiKJiMYiaPAPPLLYEVSaLSKIRAGPSYGEASSV^ 240 

Query: 241 KKNNLPALFIHGDKDNFVPTSMVYDNYKATAGKKELYIVKGAKHAKSFETEPEK 294

KN P LFIHGDKD+FVPT MVYDNYKAT G KE+ IVKGAKHAKSFET PE+ 
Sbjct: 241 AKNKRPTLFIHGDKDDFVPTKMVYDNYKATKGPKEILIVKGAKHAKSFETNPEQ 294 

SEQ ID 9286 (GBS662) was expressed in E.coli as a GST-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 136 (lane 8-10; MW 63kDa) and in Figure 187 (lane 4; MW 63kDa).



GBS662-GST was purified as shown in Figure 237, lane 7. 
Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics.

Example 2286

A DNA sequence (GBSx2420) was identified in S.agalactiae <SEQ ID 7051> which encodes the amino
acid sequence <SEQ ID 7052>. This protein is predicted to be aspartate--ammonia ligase (asnA). Analysis
of this protein sequence reveals the following:

Possible site: 60 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2898 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9309> which encodes amino acid sequence <SEQ ID 9310>
was also identified. 

The protein has homology with the following sequences in the GENPEPT database.

>GP:AAC22222 GB:U32738 aspartate--ammonia ligase (asnA) [Haemophilus influenzae Rd]
Identities = 246/300 (82%), Positives = 268/300 (89%)

Query: 1 MIDKLEIVEVQGPILSQVGDGMQDNLSGIEHPVSVKVLNIPEAEFEVVHSLAKWKRHTLA 60

+I++L I+EVQGPILSQVG+GMQDNLSGIE V V V IP A FEWHSLAKWKRHTLA 
Sbjct: 23 MEQlfiIIEVQGPILSQVGNGMQDl«:iSGIEKAVQVlTOCCIPmVFE\AmSLAEWKRHTLA 82 

Query: 61 RFGFNEGEGLFVHMKALRPDEDSLDPTHSVYVDQWDWEKVIPDGRRNLDYLKETVEKIYK 120

RF F E EGLFVHMKaLRPDEDSIiDPTHSVYVDQWDWEKVIP+GRRN YLKETV iy+ 
Sbjct: 83 RFNFKEDEGIiFVHMKaiiRPDEDSLDPTHSVYVDQWDWEKVIPEGRRNFAYLKETVNSIYR 142 

Query: 121 AIRLTELAVEARFDIESILPKRITFIHTEELVEKYPDLSPKERENAIAKEYGAVFLIGIG 180
AIRLTELAVEARFDI SILPK+ITF+H+E+LV++YPDLS KERENAI KEYGAVFLIGIG

Sbjct: 143 AIRLTELAVEaRFDIPSILPRQITFVHSEDLVKRYPDIjSSKEREN&ICKEYGAVFLIGIG 202 

Query: 181 GEIiftDGKPHDGRAPDYDDWTTPSENGFRGLNGDILVWiraQIfiTAFELSSMGIRVDEnftLK 240 
G+L+DGKPHDGRAPDYDDWTT SENG+KGIiNGDILVWN+QLG AFELSSMGIRVDE AL+ 
Sbjct: 203 GKLSDGKPHDGRAPDYDDWTTESENGYKGIJJGDILVWNDQLGimFELSSMGIRVDESALR 262

Query: 241 RQWLTGDEDRLEFEWHKTLLRGFFPLTIGGGIGQSRLAMFLLRKXHIGEVQSSVWPKEV 300 

QV LTGDED L+ +WH+ LL G PLTIGGGIGQSRLAM LLRK HIGEVQSSVWPKE+ 
Sbjct: 263 IiQVGLTGDEDHLK^roWHQDIlLNGKIlPLTIGGGIGQSRIJWILLIJRKKHIGEVQSSVWPKE^ 322 


A related DNA sequence was identified in S.pyogenes <SEQ ID 7053> which encodes the amino acid 
sequence <SEQ ID 7054>. Analysis of this protein sequence reveals the following: 

Possible site: 34 
>>> Seems to have no N-terminal signal sequence
INTEGRAL Likelihood = -0.16 Transmembrane 189 - 205 ( 189 - 205)

Final Results 

bacterial membrane --- Certainty=0.1065 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the databases:

>GP:AAC22222 GB:U32738 aspartate--ammonia ligase (asnA) [Haemophilus influenzae Rd]
Identities = 255/330 (77%) , Positives = 289/330 (87%) 


Query: 1 MKKSFIHQQEEISFVKNTFTQYLIAKLDVVEVQGPILSRVGDGMQDNLSGTENPVSVNVL 60 
MKK+FI QQ+EISFVKNTFTQ LI +L ++EVQGPILS+VG+GMQDNLSG E V VNV
Sbjct: 1 MKKTFILQQQEISFVKNTFTQNLIEQLGIIEVQGPILSQVGNGMQDNLSGIEKAVQVNVK 60

Query: 61 KIPNATFEVVHSIJyWKRHTriaRFGElffiGEGLVVNMKM,RPDEDSL^ 120 

IPNA PEWHSIiRRHKRHTLRRF F E EGL V+MKALRPDEDSLD THSVYVDQWDWE 
Sbjct: 61 CIjmWEVVHSLAKWKRHTI^ENFKEDEGLPVHMKaLRPDEDSIiDPTHSVYVDQW 120 

Query: 121 roriPDGKRTOAYLKETVETIYKVIRLTELAVEARYDIEAVLPKKITFIHTEELVAKYPDL 180 

KVIP+G+RN AYLKETV +IY+ IRLTELAVEAR+DI ++LPK+ITF+H+E+LV +YPDL 
Sbjct: 121 mPEGRRNFAYLKETVNSIYRAIRLTELAVEftRFDlPSILPKQITPVHSEDLVKRYPDL 180 

Query: 181 TPKERENAITKEFGAVFLIGIGGVLPDGKPHDGRAPDYDDWTTETENGYHGLNGDILVWN 240 

+ KERENAI KE+GAVFLIGIGG L DGKPHDGRAPDYDDWTTE+ENGY GLNGDILVWN 
Sbjct: 181 SSKEEENAICKEYGAVFLIGlGGKLSIXSKPHDGRAPimJDWrTESENGyKGIiNGDILV^ 240 

Query: 241 DQLGSAFELSSMGIRVDEEALKRQVEMTGDQDRLGFDWHKSLrjNGLFPIiTIGGGIGQSRM 300 

DQLG AFELSSMGIRVDE AL+ QV +TGD+D L DWH+ LLNG PLTIGGGIGQSR+ 
Sbjct: 241 DQLGKAFELSSMGIRVDESALRLQVGLTGDEDHLKMDWHQDLLNGKIiPLTIGGGIGQSRL 300 

Query: 301 VMFLLRKQHIGEVQTSVWPQEVRDSYDNIL 330 

M LLRK+HIGEVQ+SVWP+E+ + + NIL 
Sbjct: 301 AMLLLRKKHIGEVQSSVWPKEMLEEFSNIL 330 

An alignment of the GAS and GBS proteins is shown below. 

Identities = 254/303 (83%) , Positives = 280/303 (91%) 



Query: 1 MIDKLEIVEVQGPILSQVGDGMQDNLSGIEHPVSVKVLNIPEAEFEVVHSLAKWKRHTLA 60
+I KL++VEVQGPILS+VGDGMQDNLSG E+PVSV VL IP A FEVVHSLAKWKRHTLA
Sbjct: 23 LIAKLDVVEVQGPILSRVGDGMQDNLSGTENPVSVNVLKIPNATFEVVHSLAKWKRHTLA 82

Query: 61 RFGFNEGEGLFVHMKALRPDEDSLDPTHSVYVDQWDWEKVIPDGRRNLDYLKETVEKIYK 120
RFGFNEGEGL V+MKALRPDEDSLD THSVYVDQWDWEKVIPDG+RNL YLKETVE IYK
Sbjct: 83 RFGFNEGEGLVVNMKALRPDEDSLDQTHSVYVDQWDWEKVIPDGKRNLAYLKETVETIYK 142

Query: 121 AIRLTELAVEARFDIESILPKRITFIHTEELVEKYPDLSPKERENAIAKEYGAVFLIGIG 180
IRLTELAVEAR+DIE++LPK+ITFIHTEELV KYPDL+PKERENAI KE+GAVFLIGIG
Sbjct: 143 VIRLTELAVEARYDIEAVLPKKITFIHTEELVAKYPDLTPKERENAITKEFGAVFLIGIG 202

Query: 181 GELADGKPHDGRAPDYDDWTTPSENGFKGLNGDILVWNEQLGTAFELSSMGIRVDEDALK 240
G L DGKPHDGRAPDYDDWTT +ENG+ GLNGDILVWN+QLG+AFELSSMGIRVDE+ALK
Sbjct: 203 GVLPDGKPHDGRAPDYDDWTTETENGYHGLNGDILVWNDQLGSAFELSSMGIRVDEEALK 262

Query: 241 RQVVLTGDEDRLEFEWHKTLLRGFFPLTIGGGIGQSRLAMFLLRKXHIGEVQSSVWPKEV 300
RQV +TGD+DRL F+WHK+LL G FPLTIGGGIGQSR+ MFLLRK HIGEVQ+SVWP+EV
Sbjct: 263 RQVEMTGDQDRLGFDWHKSLLNGLFPLTIGGGIGQSRMVMFLLRKQHIGEVQTSVWPQEV 322

Query: 301 RDT 303
RD+
Sbjct: 323 RDS 325





Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics.
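
The identity and positive scores quoted in the alignment headers of these examples are simple ratios of matched positions to alignment length. The following is a minimal sketch, not part of the original analysis, of how such a header line could be parsed and the percentage recomputed. The function name `parse_alignment_header` and the dictionary keys are illustrative assumptions, and the recomputed figure may differ by a point from the printed one, depending on how the original tool rounded.

```python
import re

# Matches fields such as "Identities = 246/300 (82%)". The pattern tolerates
# the stray spaces that appear around "=" and "," in the listings.
FIELD = re.compile(
    r"(Identities|Positives|Gaps)\s*=\s*(\d+)\s*/\s*(\d+)\s*\(\s*(\d+)\s*%\s*\)"
)


def parse_alignment_header(line):
    """Return the counts and percentages found in one alignment header line.

    Hypothetical helper for illustration only.
    """
    stats = {}
    for name, count, length, printed in FIELD.findall(line):
        count, length = int(count), int(length)
        stats[name] = {
            "count": count,                      # matched positions
            "length": length,                    # alignment length
            "printed_pct": int(printed),         # value as printed in the header
            "recomputed_pct": 100.0 * count / length,
        }
    return stats


if __name__ == "__main__":
    header = "Identities = 246/300 (82%) , Positives = 268/300 (89%)"
    for name, s in parse_alignment_header(header).items():
        print(name, s["count"], s["length"], s["printed_pct"],
              round(s["recomputed_pct"], 1))
```

Comparing `printed_pct` with `recomputed_pct` is one way to spot header values that were damaged during text extraction.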

Example 2287 

A DNA sequence (GBSx2421) was identified in S.agalactiae <SEQ ID 7055> which encodes the amino 
acid sequence <SEQ ID 7056>. Analysis of this protein sequence reveals the following: 

Possible site: 27
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.3163 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database.
No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.
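
The "Final Results" listings in these examples assign each protein a certainty score for three locations (cytoplasm, membrane, outside). As a hedged illustration only, the sketch below shows one way such a listing could be read into a dictionary and the highest-scoring location picked out. The names `parse_final_results` and `most_likely_location` are hypothetical, and the pattern is written to tolerate the spacing noise found in extracted text.

```python
import re

# One result line looks like:
#   bacterial cytoplasm --- Certainty=0.3163 (Affirmative) < succ>
# Dashes are optional and digits may be separated by stray spaces.
RESULT = re.compile(
    r"bacterial\s+(cytoplasm|membrane|outside)\s*-*\s*"
    r"Certainty\s*=\s*([0-9.\s]+?)\s*\(([^)]+)\)"
)


def parse_final_results(text):
    """Map each location to a (certainty, verdict) tuple.

    Hypothetical helper for illustration only.
    """
    scores = {}
    for location, value, verdict in RESULT.findall(text):
        # Remove any whitespace inside the number before converting it.
        scores[location] = (float("".join(value.split())), verdict.strip())
    return scores


def most_likely_location(scores):
    """Return the location with the highest certainty score."""
    return max(scores, key=lambda location: scores[location][0])


if __name__ == "__main__":
    listing = """
    bacterial cytoplasm --- Certainty=0.3163 (Affirmative) < succ>
    bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
    bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
    """
    results = parse_final_results(listing)
    print(most_likely_location(results), results)
```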

Example 2288

A DNA sequence (GBSx2422) was identified in S.agalactiae <SEQ ID 7057> which encodes the amino
acid sequence <SEQ ID 7058>. Analysis of this protein sequence reveals the following:

Possible site: 25
>>> Seems to have a cleavable N-term signal seq.

Final Results
bacterial outside --- Certainty=0.3000 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9007> which encodes amino acid sequence <SEQ ID 9008> 
was also identified. 

The protein has homology with the following sequences in the GENPEPT database.

>GP:AAD56628 GB:AF165218 Bta [Streptococcus pneumoniae]
Identities = 30/97 (30%) , Positives = 50/97 (50%) , Gaps = 3/97 (3%)

Query: 50 KALVSKSQQSEATIFIGRPTCQYCRAFLPKLLKSQATLHSKIYYLDSQKYKG-KRLKSFF 108 

+A + ++ AT FIGR TC YCR F L A + iy+++S++ I1++F 
Sbjct: 18 RAQEftLDKKETATFFIGRKTCPYCRKFAGTLSGVVAETKRHiyFINSEEASQLNDLQAFR 77 


Query: 109 KKHHITTVENLftHYQQGKMTKYLVQGSQATPQQIQTF 145 

++ I TVP H G++ + S + Q+I+ F 
Sbjct: 78 SRyGIPTVPGFVHITDGQIN--VRCDSSMSAQEIKDF 112 

SEQ ID 9008 (GBS 134) was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 40 (lane 2; MW 17kDa). It was also expressed in E.coli as a GST-fusion
product. SDS-PAGE analysis of total cell extract is shown in Figure 46 (lane 4; MW 42kDa).

GBS134-GST was purified as shown in Figure 204, lane 10. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2289 

A DNA sequence (GBSx2423) was identified in S.agalactiae <SEQ ID 7059> which encodes the amino 
acid sequence <SEQ ID 7060>. Analysis of this protein sequence reveals the following: 

Possible site: 58
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.0735 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9603> which encodes amino acid sequence <SEQ ID 9604>
was also identified.

The protein has homology with the following sequences in the GENPEPT database.

>GP:BAB06309 GB:APOOlBie unknown conserved protein [Bacillus halodurans]
Identities = 78/178 (43%) , Positives = 115/178 (63%) , Gaps = 3/178 (1%) 

Query: 3 MRVVMTFGGRPLKTLDGKTTRPTTDKVKGAIE^mIGPPFEGGRVIlDLPSGSGSLAIEAI 62

MRV+AG G LK + G TRPTTDKVK AIENMIGPPF+GG I1DL+ GSG L IEA+ 
Sbjct: 1 MRVIAGEQKGLTLKAVPGHKTRPTTDKOTEAIFSMIGPFFDGGIGLDLYGGSGGLGIES^ 60 

Query: 63 SRGMDQAVLVEKDRRAQWIQENIAMTKSPEQFQLLKMEANRALEQLTGQ---FDLVLLD 119 
SRG+++ + V++ +RA I++N++ + ++ + +A RAL+ LT + F V LD

Sbjct: 61 SRGVERMIFVDQQKRAIETIKQlSnjSHa3LEGRaEVYRm)AKRALQVLTKRGIVFAYVFLD 120 

Query: 120 PPYAKEEIVKQIQIMDSKGLLGDDIMIACETDKSVDLPEEIASFGIWKQKIYGISKVT 177 
PPYAK+ I + 1+ + GLL + ++ CE D+ LP++I K++ YG + +T 

Sbjct: 121 PPYAKQTIKNDLAirANHGLLEEGGVWCEHDRDTMLPDQIEYAVKHKEETYGDTMIT 178

There is also homology to SEQ ID 132. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2290

A DNA sequence (GBSx2424) was identified in S.agalactiae <SEQ ID 7061> which encodes the amino
acid sequence <SEQ ID 7062>. Analysis of this protein sequence reveals the following:

Possible site: 14
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.4984 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:CAB96619 GB:AJ400630 hypothetical protein [Streptococcus pneumoniae bacteriophage MM1]
Identities = 175/254 (68%) , Positives = 219/254 (85%) 

Query: 2 LRRHIYSMLEEHXHLQPEIKYHQKTNLRKNRVYTVFIEEKVDVILADLKLADAFFGIETG 61

L RH+Y ++ EI++HQ++NLRKNRVYTVF +EKV +L+DL lAD+FFG+ETG 

Sbjct: 50 LARHLYESFLHFYEIKSEIRHHQRSNLRKNRVYTVFTDEKVQDLLSDLHLADSFFGLETG 109 

Query: 62 IEHSILDNDENGRAYLRGAFLSTGTVREPDSGKYQLEIFSVYLDHAQDLANLMKKFMLDA 121
1+ +IL ++E GRAYL GAFLh- G++R+P+SGKYQLEI SVYLDHAQ +A+L+++F+LnA

Sbjct: 110 IDEAILSDEEaGRAYLCGaPLaNGSIRDPESGKYQI.EISSVYLDHAQGIASLLQQPLI.DR 169 

Query: 122 KVIEHKHGAVTYLQKAEDIMDFLIVIDAMEARDAFEEIKMIRETRNDINRANNVETANIA 181
KV+E K GAVTYLQ+AEDIMDFLIVI AM+ARD FE +K++RETRND+NRANN ETANIA 
Sbjct: 170 KVLERKKGAVTYLQRaEDimPLIVIGaMQaRDDFERVKILRETRNDIJJRANNaETANIA 229

Query: 182 RTITASMKTINNIIKIMDTIGFDALPSDLRQVAQVRVaHPDYSIQQIADSLETPLSKSGV 241 

RT++ASMKTINNI KI D +G + LP DL++VAQ+R+ HPDYSIQQ+ADSL TPL+KSGV 
Sbjct: 230 RTVSASMRTIHNISKIKDIMGLENLPVDLQEVAQLRIQHPDYSIQQLRDSLSTPLTKSGV 289 



Query: 242 NHRLRKINKIADEL 255 

NHRLRKINKIADEL
Sbjct: 290 NHRLRKINKIADEL 303 



There is also homology to SEQ ID 5540:

Identities = 186/254 (73%) , Positives = 227/254 (89%) 

Query: 2 LRRHIYSMLEEHXHLQPEIKYHQKTNLRKNRVYTVFIEEKVDVILADLKLADAFFGIETG 61
+ R+IYS++E+ + PEI+YHQKENLRKNRVYTV++E+ V+ ILADLKLAD+FFG+ETG
Sbjct: 50 IJUlYIYSLIEDAYVIVPEIRYHQKraLRKimVYTVYVEQGWTII^LKIiApSFFGLETG 109

Query: 62 IEHSILDNDENGRAYLRGAFLSTGTVREPDSGKYQLEIFSVYLDHAQDLANLMKKFMLDA 121
TTT it. j-Tl fT? J-VT.j-naPT.a. (^j.a-P-j-Pa-QPTTVnT.'PTj.C'^TVTinMZinnTiA TjM+KPMTiDA
X Jl +JJ +U oi\.T X iJ+^jriJc JJt (j't "rj\-r ir -ro'ciri. I l^Jjii X -r o v i JJiJriH.yi-'-i-UT- J-ii'n^i\.i. I'lXJL/n.
Sbjct: 110 IEPQVLSDDNAGRSYLKGAFLAAGSIRDPESGKYQLEIYSVYLDHAQDLAQLMQKFMLDA 169

Query: 122 KVIEHKHGAVTYLQKAEDIMDFLIVIDAMEARDAFEEIKMIRETRNDINRANNVETANIA 181
Sbjct: 170 KTIEHKSGaVTyLQKaEDI^mFLIIIGaMSCKEDFEAIKIJLREARlroINRft^^ 229

Query: 182 RTITASMKTINNIIKIMDTIGFDALPSDLRQVAQVRVAHPDYSIQQIADSLETPLSKSGV 241
+TI+ASMKTIlilNIIKIMDTIG ++IiP +3:.+QVAQ+RV HPDYSIQQ+AD+LE P++KSGV
Sbjct: 230 KTISASMKTINNIIKI^IDTI6LESLPIE]XX3VaQLRVKHPDYSIQQVAnRLEFPITKSGV 289

Query: 242 NHRLRKINKIADEL 255
NHRLRKINKIAD+L
Sbjct: 290 NHRLRKINKIADDL 303





Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics.

Example 2291 

A DNA sequence (GBSx2425) was identified in S.agalactiae <SEQ ID 7063> which encodes the amino 
acid sequence <SEQ ID 7064>. Analysis of this protein sequence reveals the following: 

Possible site: 14
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.0297 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 
No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2292 

A DNA sequence (GBSx2428) was identified in S.agalactiae <SEQ ID 7065> which encodes the amino 
acid sequence <SEQ ID 7066>. Analysis of this protein sequence reveals the following: 

Possible site: 31
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2706 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB54571 GB:AJ006393 response regulator [Streptococcus pneumoniae] 
Identities = 139/190 (73%) , Positives = 166/190 (87%) 


Query: 8 IKIVLVDDHEM\mLGLKSFimiQADVE\n:GEASNGLEeiKKALEIiRPDVV^^ 67 

+KX+LVDDHEMVRLGLKS+ +LQ DVEV+GEASNG +GI RLELRPDV+VMD+VMPEM+ 
Sbjct: 1 MKILLVDDHEMVRLGLKSYFDLQDDVEVVGEASNGSQGIDLALELRPDVIVTO 60

Query: 68 GVEaTLALLKDWPEAAILVLTSYLDNEKIYPVimGAKGVMLKTSSftAEIJmiR^ 127

G++ATIiA+LK+WPEA IL++TSYIiDNEKI PV++AGAKGYMLKrSSA E+L+A+ KV+ G 
Sbjct: 61 GIDATIiAILKEWPEAKILIVTSYLDNEKIMPVLDAGAKGVMLKTSSADELLHAVSKVAAG 120 

Query: 128 EQAIElffiVDKKIKAHDKCPAIiHEGLTiUlERDILNLIAKGyDNQRXADELFX 187 

E AIE EV KK++ H LHE LTARERD+L L+ARGY+NQRIAD+LFISLKTVETHV 
Sbjct: 121 ELAIEQEVSKKVEYHRNHMELHEELTARERDVI^LIAKlGyENQRIADDLFISLKT^ 180 

Query: 188 SNILGKLWGS 197 

SNIL KL S 
Sbjct: 181 SNILAKLEVS 190 



There is also high homology to SEQ ID 2996: 

Identities = 158/198 (79%) , Positives = 176/198 (88%) , Gaps = 1/198 (0%) 



Query: 5 MDKIKIVLVDDHEMVRLGLKSFLNLQADVEVIGEASNGLEGIKKALELRPDVVVMDLVMP 64
M KIK++LVDDHEMVR+GLKSFLNLQAD++V+GEASNG EG+ AL L+PDV+VMDLVMP
Sbjct: 3 MSKIKVILVDDHEMVRMGLKSFLNLQADIDVVGEASNGREGVDLALALKPDVLVMDLVMP 62

Query: 65 E^mC3VEATIAIlLKDWPEaAILVLTSYIInNEKIyPVIEAGAKGy^mKTSSaM 124
E+ GVEATIi +IiK W EA +LVLTSYIiDNErayPVI+fiGAKGYMIiKTSSAaEIimiRKV
Sbjct: 63 ELGGVEATLEVLKKMEAKVLVLTSYLDlffiKiyPVIDaGaKBYMLKTSSaflEI]^ 122

Query: 125 SRGEQAIENEVDKKIKAHDKCPALHEGLTARERDILNLLAKGYDNQRIADELFISLKTVK 184
S+GE AIE EVDKKIKAHD+ P LHE LTARE DIL+LLAKGYDNQ IADELFISLKTVK
Sbjct: 123 SKGELAIETEVDKKIKAHDQHPDLHEELTAREYDILHLLAKGYDNQTIADELFISLKTVK 182

Query: 185 THVSNILGKLN-GSRSNS 201
THVSNIL KL G R+ +
Sbjct: 183 THVSNILAKLEVGDRTQA 200





Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2293 

A DNA sequence (GBSx2429) was identified in S.agalactiae <SEQ ID 7067> which encodes the amino
acid sequence <SEQ ID 7068>. This protein is predicted to be histidine kinase (narQ). Analysis of this 
protein sequence reveals the following: 

Possible site: 56
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.3944 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB54570 GB:AJ006393 histidine kinase [Streptococcus pneumoniae] 
Identities = 32/55 (58%) , Positives = 49/55 (88%) 

Query: 1 MIDNGlGFDlTOSVYDLSYGLKNIEDRVEDIAGItLQLLSQPGKGVAMDIRriPI.VNQ 55 

++DNGIGF + S+ DLSYGL+NI++RVED+AG +QLI.+ P +G+A+DIR+PL+++ 
Sbjct: 276 VVDNGIGFQI^SUDDLSYGLRNIKERVEDMAGTOQLLTAPKBGIATOIRIPLLDK 330 

There is also homology to SEQ ID 2992: 

Identities = 44/59 (74%) , Positives = 51/59 (85%) 



Query: 1 MIDNGIGFDMDSVYDLSYGLKlIIEDRVEDLAGNLQLLSQPGK6V2iMDIRLPLVNQSEDK 59 

MID+G+GFDMD V DLSYGLKNIEDRV DLAGNL L+SQ GKGV+MDIRLP+V +D+ 
Sbjct: 276 MIDDGVGFDMXJTODLSYGLKNIEDRVMDLAGNIiHLISQKGKGVSMDIRLPIVKGDDDE 334

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics.

Example 2294 

A DNA sequence (GBSx2430) was identified in S.agalactiae <SEQ ID 7069> which encodes the amino
acid sequence <SEQ ID 7070>. This protein is predicted to be RfbQRS0155-l. Analysis of this protein 
sequence reveals the following: 

Possible site: 41
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.1120 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

There is also homology to SEQ ID 7072: 

Identities = 171/172 (99%) , Positives = 172/172 (99%) 

Query: 1 MGQVAVEEKSNEIVAIPQLLRTIDIRKSIVTIDAMGTQTAIVDTIIKGKADYCLAVKGNQ 60
+GQVAVEEKSNEIVAIPQLLRTIDIRKSIVTIDAMGTQTAIVDTIIKGKADYCLAVKGNQ

Sbjct: 143 LGQVAVEEKSNEIVAIPQLLRTIDIRKSIVTIDAMGTQTAIVDTIIKGKADYCLAVKGNQ 202 

Query: 61 ETLYDDIALYFSDVNLLEELQENAQYYQTVEKSRGQIEVREYWVSSDIKWLCGNHPKWHK 120
ETLYDDIALYFSDVNLLEELQENAQYYQTVEKSRGQIEVREYWVSSDIKWLCGNHPKWHK
Sbjct: 203 ETLYDDIALYFSDVNLLEELQENAQYYQTVEKSRGQIEVREYWVSSDIKWLCGNHPKWHK 262

Query: 121 LRGIGMTRNTIDKDGQLSQENRYFIFSFKPDVLTFANCVRGHWQIESMHWLL 172 

LRGIGMTRNTIDKDGQLSQENRYFIFSFKPDVLTFANCVRGHWQIESMHWLL 
Sbjct: 263 LRGIGMTRNTIDKDGQLSQENRYFIFSFKPDVLTFANCVRGHWQIESMHWLL 314


Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2295 

A DNA sequence (GBSx2431) was identified in S.agalactiae <SEQ ID 7073> which encodes the amino 
acid sequence <SEQ ID 7074>. This protein is predicted to be translation initiation factor if-3 homolog dsg
(infC). Analysis of this protein sequence reveals the following: 

Possible site: 42
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.1787 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:CAA58920 GB:Y0764Q translation initiation factor, IF3 [Listeria monocytogenes] 
Identities = 112/169 (66%) , Positives = 134/169 (79%) 

Query: 7 KDLFINDEIRVREVRLVGLEGEQLGIKPLSEAQAIADDANVDLVLIQPQATPPVAKIMDY 66
KD+ +ND IR REVRL+ +GEQLG+K +A IA+ AN+DLVL+ P A PPVA+IMDY

Sbjct: 3 KDMLVNDGIRAREVRLIDQDGEQLGVKSKIDALQIAEKANLDLVLVAPTAKPPVARIMDY 62 



Query: 67 GKFKFEYQKKQKEQRKKQSVVTVKEVRLSPVIDKGDFETKLRNGRKFLEKGNKVKVSIRF 126
GKF+FE QKK KE RK Q V+ +KEVRLSP ID+ DF+TKLRN RKFLEKG+KVK SIRF
Sbjct: 63 GKFRFEQQKKDKEARKNQKVIVMKEVRLSPTIDEHDFDTKLRNaRKFLEKGDKVKCSIRF 122

Query: 127 KGRMITHKEIGAKVLAEFAEATQDIAIIEQRAKMDGRQMFMQLAPIPDK 175 
KGR ITHKEIG KVL FA+A +D+ lEQR KMDGR MF+ LAP+ +K 
Sbjct: 123 KGRAITHKEIGQKVLDRFAKACEDLCriEQRPKMDGRSMFLVIAPLHEK 171

A related DNA sequence was identified in S.pyogenes <SEQ ID 7075> which encodes the amino acid 
sequence <SEQ ID 7076>. Analysis of this protein sequence reveals the following:

Possible site: 42
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2247 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below. 

Identities = 167/176 (94%), Positives = 173/176 (97%)

Query: 1 MKIIAKKDLFINDEIRVREVRLVGLEGEQLGIKPLSEAQAIADDANVDLVLIQPQATPPV 60

+KIIAKKDLFINDEIRVREVRLVGLEGEQLGIKPLSEAQ++AD +NVDLVLIQPQA PPV
Sbjct: 1 VKIIAKKDLFINDEIRVREVRLVGLEGEQLGIKPLSEAQSLADASNVDLVLIQPQAVPPV 60

Query: 61 AKIMDYGKFKFEYQKKQKEQRKKQSVVTVKEVRLSPVIDKGDFETKLRNGRKFLEKGNKV 120
AK+MDYGKFKFBYQKKQKEQRKKQSVVTVKEVRLSPVIDKBDFETKliRNGRKFLEKGNK^

Sbjct: 61 AKIJ4DYGKFKFEYQKKQKEQRKKQSVVT\nCETOLSl?VIDKaDPETKIiRNG 120 

Query: 121 KVSIRFKGRMITHKEIGAKVLAEFAEATQDIAIIEQRAKMDGRQMFMQLAPIPDKK 176 
KVSIRFKGRMITHKEIGAKVLA+FAEATQDIAI lEQRAKMDGRQMFMQLAPI DKK 
Sbjct: 121 KVSIRFKGRMITHKEIGAKVLADFAEATQDIAIIEQRAKMDGRQMFMQLAPISDKK 176

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2296 

A DNA sequence (GBSx2432) was identified in S.agalactiae <SEQ ID 7077> which encodes the amino
acid sequence <SEQ ID 7078>. Analysis of this protein sequence reveals the following:

Possible site: 57
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.1807 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:AAC45308 GB:U81957 RNA polymerase beta' subunit [Streptococcus gordonii] 
Identities = 262/286 (91%) , Positives = 276/286 (95%) 

Query: 1 ^1ARKWKAGVEEVXIRSVFTC1T^RHGVCHHCYGINIATGDAVEVGEKVGTI3^ 60 
MA +W AGV EV IRSV TCNTRHGVCRHCYGINLATGDAVEVGEAVGTIAAQSIGEPG

Sbjct: 122 MARQWNAGVTEVTIRSVLTCNTRHGVCRHCYGINLATGDAVEVGEAVGTIAAQSIGEPG 181 

Query: 61 TQLTMRTFHTGGVASNTDITQGLPRIQEIFEARNPKGEAVITEVKGEWAIEEDSSTRTK 120 
TQLTMRTFHTGGVAS++D1TQGLPR+QEIFEARNP1«3EAVITEVKGEV AIEED+STRTK 
Sbjct: 182 TQLTMRTFHTGGVRSSSDITQGLPRVQEIFEARNPKGEAVITEVKGEVTAIEEDASTRTK 241

Query: 121 KVFVKGOTGEGEYVVPFTARMKVEVGDEVARGAALTEGSIQPKIUiEVRDTLSVETYLI^ 180 

KVFVKGQTGEGEYWPFTARMKVEVGD+V+RGAALTEGSIQPK LL VRD LSVETYLLA 
Sbjct: 242 KVFVKGQTGEGEYWPFTARMKVEVGDQVSRGAaLTEGSIQPKHIiLAVRDVr.SVETYLLA 301

Query: 181 EVQKVYRSQGVEIGDKHVEVMVRQMLRKVRVMDPGDTDLLPGTLMDISDFTDANKDIVIS 240
EVQKVyRSQGVEIGDKH+EVMVRQM+RKVRVMDPGDTDLIi GTLMDI+DFTDAN+D+VIS 
Sbjct: 302 EVQKVTOSQGVEIGDKHIEVMVRQMIRKVRVMDPGDTDLIlMGTL^C)ITDFTDR^ 361 


Query: 241 GGIPATSRPVLMGITKASLETNSFLSAASFQETTRVLTDAAIRGKK 286 

GG+PAT+RPVLMGITKASLETMSFLSAASFQETTRVLTDaaiRGKK 
Sbjct: 362 GGVPATARPVLMGITKASLETNSFLSAASFQETTRVLTDAAIRGKK 407 

There is also homology to SEQ ID 384.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2297 

A DNA sequence (GBSx2434) was identified in S.agalactiae <SEQ ID 7079> which encodes the amino 
acid sequence <SEQ ID 7080>. Analysis of this protein sequence reveals the following:

Possible site: 24
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.0352 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database.

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2298 

A DNA sequence (GBSx2435) was identified in S.agalactiae <SEQ ID 7081> which encodes the amino 
acid sequence <SEQ ID 7082>. This protein is predicted to be acetoin dehydrogenase (TPP-dependent)
beta chain (pdhB). Analysis of this protein sequence reveals the following: 

Possible site: 47
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.0266 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:BAB04496 GB:AP001509 acetoin dehydrogenase (TPP-dependent) beta 
chain [Bacillus halodurans] 
Identities = 37/57 (64%) , Positives = 50/57 (86%) 

Query: 1 MLEEFGAKRVRDTPISEAAIAGSAIGAAQTGLRPIVDLTFMDFVTIAMDAIVDDCIR 57

M+EEFG++RVR+TPISEaAI+G+AIGAA TG+RPI++L F DF+TIAMD +V+ + 
Sbjct: 44 MIEEFGSERVRNTPISEAAISGTAIGRALTGMRPILELQFSDPITIAMDNMVNQftAK 100 

There is also homology to SEQ ID 4272.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2299 

A DNA sequence (GBSx2436) was identified in S.agalactiae <SEQ ID 7083> which encodes the amino 
acid sequence <SEQ ID 7084>. This protein is predicted to be Structural protein. Analysis of this protein
sequence reveals the following:

Possible site: 30
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.3015 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:AAB18706 GB:U38906 Structural protein [Bacteriophage rlt] 
Identities = 57/127 (44%) , Positives = 83/127 (64%) 

Query: 5 IKAGTLFKPELVTEIMSKVKGHSTLAKLSGQTPIPFNGVEQFVFNLDGNAQIVGEGEQKL 64 

+ GTLF P LVT+++SKV G S++A+LS Q PIPFNG + F F +D +V E +K

Sbjct: 3 LNKGTLFDPTLVTDLISKVAGKSSIARLSAQKPIPFNGEKVFTFTMDSEIDWAESGICKT 62 

Query: 65 GNTAKOTSKIIKPLKFVYQftRMTDEFKYASEEKRLNFLKHYADGFAKKMaEAFD 124 
+ + + P+K Y AR++DEF yAS+E+++N L+ + DGFAKK+A D+ A HG 
Sbjct: 63 HGGVTIAPQTMVPIKVEYGS^RISDEFNnfASDEEKINILQEENDGFAKKVARGIDLMAFHG 122

Query: 125 BEPRTMT 131 

+ PR T 
Sbjct: 123 VNPRLGT 129 


Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2300 

A DNA sequence (GBSx2439) was identified in S.agalactiae <SEQ ID 7085> which encodes the amino 
acid sequence <SEQ ID 7086>. This protein is predicted to be surface protein Rib. Analysis of this protein
sequence reveals the following: 

Possible site: 24
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.1892 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2301 

A DNA sequence (GBSx2440) was identified in S.agalactiae <SEQ ID 7087> which encodes the amino
acid sequence <SEQ ID 7088>. Analysis of this protein sequence reveals the following:

Possible site: 39
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2227 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2302 

A DNA sequence (GBSx2441) was identified in S.agalactiae <SEQ ID 7089> which encodes the amino 
acid sequence <SEQ ID 7090>. This protein is predicted to be integrase. Analysis of this protein sequence 
reveals the following: 

Possible site: 37
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2948 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9319> which encodes amino acid sequence <SEQ ID 9320> 
was also identified. 

The protein has homology with the following sequences in the GENPEPT database.

>GP:CAB95616 GB:AJ400629 integrase [Streptococcus pneumoniae
bacteriophage MM1]
Identities = 84/238 (35%) , Positives = 137/238 (57%) , Gaps = 8/238 (3%) 

Query: 1 MTIiDiaSSSQaQKKaGLILQEKIEDRLAIiaraSEMTYGELKKEyLKQWIPTVKD 60

+T++K + QA+ +A ++LQEKI +L+ + +T+ E+ + K W TVK+STK 
Sbjct: 30 VT^ffiKKIPQRRNQRAILIlQEKINKKLSTKQ\mSITFEEIYNLFyKSmQTVKESTKHNCK 89 

Query: 61 VSDSHIATVLPDDTIINKLTKRDIRLIIDKLiLKHNSYHVTHKCRKRLHAIFSYAIQMDYM 120 
D + V+P DTI+ L +R ++ I+K+++ NY K R RL IF+YA+Q Yh-

Sbjct: 90 SVDKKMKEVIPSDTILaNLDRRFLQEAIEKIIESNGYITAKKVRHRLRGlEWYAVQYSyi 149 

Query: 121 TSNPTENVLVP-KPK--DDYKPEKVLYLTSNEV YDLCNRMIDNDEQTLADIVLFMFL 174 

+N + +P KPK ++ + ++ +LT E+ D+ NR Q AD+VL + L 

Sbjct: 150 ENNEVDYTTIPQKPKTLEELEKKRHNFLTMQEIKALVDVIJIRR--EYHQKyADMVLVL'rL 207






Query: 175 TGVRYGELSCLTYDKIDFENKEILINATYDFNTRXITTTKTKKSTRKISVSDNILDIV 232 

TG+RYGEL+ L IDFEN +11 +D + T KT S R I VS+++++ + 
Sbjct: 208 TGMRYGEIiTALQLKNIDFENNKIEITGNFDSVIsreiKTLPKTTNSIRTlKVSESVIEAI 265 

There is also homology to SEQ ID 578. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2303 

A DNA sequence (GBSx2444) was identified in S.agalactiae <SEQ ID 7091> which encodes the amino
acid sequence <SEQ ID 7092>. Analysis of this protein sequence reveals the following: 

Possible site: 50
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2518 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 
There is also homology to SEQ ID 4212: 

Identities = 92/144 (63%) , Positives = 118/144 (81%) , Gaps = 1/144 (0%) 

Query: 1 MPKYSLFELENGRRRLLASAGELQKGNKLALPTQFMKFLYLASRYNESKGKPEEIEKKQE 60 

+PKYSLFELENGR+R+LASAGELQKGNEIiaLP++++ FLYLAS Y + KG PE+ E+KQ 
Sbjct: 1198 LPKYSLFELENGRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGSPEDNEQKQL 1257 

Query: 61 BTOQHVSYFDDILQLINDFSKRVIIMftNLEKINKLYQDNKENISVDELRNNIINLETF^ 120

FV QH Y D+I++ I++FSKRVIIjaDaNL+K+ Y +++ + E A NII+LFT T 
Sbjct: 1258 FVEQHKHYIiDEIIEQISEFSKRVILaDAHUJKVLSAYNKHR^ 1316 

Query: 121 SLGAPAAFKFFDKIVDRKRYTSTQ 144 
+LGAPAAFK+FD +DRKRYTST+

Sbjct: 1317 NLGAPAAFKYFDTTIDRKRYTSTK 1340 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2304

A DNA sequence (GBSx2445) was identified in S.agalactiae <SEQ ID 7093> which encodes the amino 
acid sequence <SEQ ID 7094>. This protein is predicted to be 0- Analysis of this protein sequence reveals 
the following: 

Possible site: 48
>>> Seems to have no N-terminal signal sequence
INTEGRAL Likelihood = -4.57 Transmembrane 239 - 255 ( 236 - 256)

Final Results
bacterial membrane --- Certainty=0.2826 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB15253 GB:Z99120 similar to opine catabolism [Bacillus subtilis] 
Identities = 88/257 (34%) , Positives = 129/257 (49%) , Gaps = 11/257 (4%)

Query: 1 MARLGRDFYSKLVTDI^JKDGFETKIYQQTGVFLLKKDESQriESLFAMDKRRLESPLIGD 60 

+A+ GA +Y L+ L+KDG Y++ G + D S+L+ + A KRR ++P IGD 
Sbjct: 61 LAKGGARYYKDLIHQLEKDGESDT6YKRVGAISIHTDASKLDKMEERAYKRREDAPEIGD 120 


Query: 61 LQIMIKSEANTHFPEL-DGYEQLLYASGGARVEGADLTRILLEAS GVNVIKDEVHF- 115 

+ L+ SE FP L DGYE ++ SG ARV G L R LL A+ G VIK 
Sbjct: 121 ITRLSASETKKLFPILADGYES-VHISGAftRVNGRALCRSLLSABEKRGATVIKGNASLIi 179 

Query: 116 TITDNGFRVQGIDFDKLVLASGAWLRKIIjDEHNYQVDVRPQKGQLRDYYFSNINTG 171

T+T + D +++ +GAW +IL V QK Q+ + ++ +TG 

Sbjct: 180 FENGTVTGVQTDTKQFAADAVIVTAGAWANEIIiKPLGIHFQVSFQKAQIMHFEMTDADTG 239 

Query: 172 iaPVVMPEGELDIIPFDNGKVSVGASHENDMAF-DIiNIDFKVI.DKFEEQAIGYFPQLKKQ 230 
+PVVMP + 1+ PDNG++ GA+HEND DL + + +A+ P L

Sbjct: 240 SWPVVMPPSDQYILSFDN6RIWi.GATHENDAGLDDLRVTAGGQHEVLSKALAVAPGIiADA 299 

Query: 231 IRLLKRVEFVPIQVIFL 247 
+ RV F P FL
Sbjct: 300 AAVETRVGFRPFTPGFL 316

There is also homology to SEQ ID 2656. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2305 

A DNA sequence (GBSx2446) was identified in S.agalactiae <SEQ ID 7095> which encodes the amino 
acid sequence <SEQ ID 7096>. Analysis of this protein sequence reveals the following: 

Possible site: 60
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2572 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9315> which encodes amino acid sequence <SEQ ID 9316> 
was also identified. 

The protein has homology with the following sequences in the GENPEPT database.

>GP:AAC00337 GB:AF008220 YtqI [Bacillus subtilis]

Identities = 119/256 (46%) , Positives = 174/256 (67%) , Gaps = 3/256 (1%) 

Query: 6 QILDKIKEYDTIIIHRHmPDPDAIfiSQIGLRDIIRHNFPKKKVLATGFDEPTLAWIAKM 65 
+++ I YDTII+HRH+RPDPDA GSQ GL +I+R +P+K + A G EP+L+++ + 
Sbjct: 4 ELIRTISLYDTIILHRHVRPDPDAYGSQCGLTEILRKTYPEKNIFAVGTPEPSLSFLYSL 63

Query: 66 DQVTDQDYQGALVWTDTAOTPRIDDERYKRGDFLIKIDHHPNDEVYGDLSYVDTNASSA 125 

D+V ++ y+GALV+V DTAN RIDD+RY G L+KIDHHPN++ YGDL +VDT+ASS 
Sbjct: 64 DEVDNETYEGALVIVCDTANQERIDDQRYPSGAKLMKIDHHPNEDPYGDLLWVDTSASSV 123 


Query: 126 SEIVTDFAL---SCDLLLSTSaARVLYNGIVGDTGRFLYPATTSKTLKIASKLREEDPDF 182 

SE++ + L L+T AA ++Y GIVGDTGRFL+P TT RTLK A +L ++ F 

Sbjct: 124 SEMIYEIlYIlEGKEHGWKlm'KAAELIYAGIVGDTGRPLFE^^TEKTLKyAGELIQYPFSS 183 

Query: 183 SAMARQMDSFPFKIAKLQGFIFEQDKIDKHGAACVTLTQEDLKRFDVTDAETAAIVGVPG 242

S + Q+ + KL GFIF+ + + +NGAA V + ++ Ij++F T +E + +VG G 

Sbjct: 184 SELFNQLYETmHVVKIiNGFIF®lVSLSENGRASVFIKKDTLEKFGTTM 243 

Query: 243 KIDIVESMAIF\7KQSD 258 
I + +W FV++ D

Sbjct: 244 NISGIRAWVFFVEEDD 259
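Each homology hit is summarised by a line of the form "Identities = a/b (p%) , Positives = ... , Gaps = ...". A short sketch of how such a line could be turned into numbers is given here; the function name `parse_alignment_stats` and the dictionary layout are illustrative assumptions, and the fraction is recomputed from the counts rather than taken from the printed percentage.

```python
import re

# Matches "Identities = 119/256 (46%)" and the analogous Positives/Gaps fields.
STAT_RE = re.compile(
    r"(Identities|Positives|Gaps)\s*=\s*(\d+)\s*/\s*(\d+)\s*\((\d+)%\)"
)


def parse_alignment_stats(line):
    """Return {field: {count, length, percent, fraction}} for one summary line."""
    stats = {}
    for name, count, length, percent in STAT_RE.findall(line):
        stats[name.lower()] = {
            "count": int(count),
            "length": int(length),
            "percent": int(percent),          # percentage as printed
            "fraction": int(count) / int(length),  # recomputed from the counts
        }
    return stats


summary = "Identities = 119/256 (46%) , Positives = 174/256 (67%) , Gaps = 3/256 (1%)"
for field, values in parse_alignment_stats(summary).items():
    print(field, values["count"], values["length"], round(values["fraction"], 3))
```

Lines without a Gaps field, which occur for ungapped alignments in this text, simply yield a dictionary without a "gaps" key.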

A related DNA sequence was identified in S.pyogenes <SEQ ID 7097> which encodes the amino acid 
sequence <SEQ ID 7098>. Analysis of this protein sequence reveals the following: 

Possible site: 61

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2584 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below. 

Identities = 180/256 (70%) , Positives = 215/256 (83%) 

Query: 4 FQQILDKIKEYDTIIIHRHIiWPDPDAHSSQIGLRDIIRHNFPKKKVLATGFDEPTIiAWIA 63
F+ ILDKIK + TIIIHRH PDPDALGSQ GL++II NFP KKVL TGFDEP+LAWI+
Sbjct: 5 FETIIJ3KIK3iHQTIIIHRHQNPDPDALGSQAGLKEIIAQNFPDKKVIiMrGFDEPSM^ 64

Query: 64 KMDQVTDQDYQGALWVTDTANTPRIDDERYKKGDFLIKIDHHPNDEVYGDLSYVDTNAS 123
+MDQVTD+DY+ MjV++TDTAN PRIDDERY G LIKIDHHPND+VYGD YVDT+AS
Sbjct: 65 QiroQVTDKDYKEALVIITDTANRPRIDDERYTLGKCLIKIDHHPNDDVyGDFYYVDTSAS 124

Query: 124
SASEI+ DFA S +L LS AA++LY GIVGDTGRFLY +TTSKTL lAS+LR F+FDF+
Sbjct: 125

Query: 184
A++RQMDSFP KIAKLQ ++PE L IDt+GAA V ++QB LK FDVT AE++AIV PGK
Sbjct: 185

Query: 244
ID V++WAIFV+ +DG
Sbjct: 245 IDNVQAWAIFVELTDG

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics.

Example 2306 

A DNA sequence (GBSx2447) was identified in S.agalactiae <SEQ ID 7099> which encodes the amino
acid sequence <SEQ ID 7100>. Analysis of this protein sequence reveals the following: 

Possible site: 26

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1846 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB42949 GB:AL049863 putative adenosine deaminase [Streptomyces
coelicolor A3(2)]

Identities = 123/343 (35%) , Positives = 175/343 (50%) , Gaps = 26/343 (7%) 

Query: 6 LKELRKAELHCHIiDGSLSLPAIRiajftNMADIILPSSDK-EIiRKSVIAPAQTESLVDY^ 64
L+ L KA IiH HLDG L + +LA LP++D EL + A + LV Y+ T
Sbjct: 11 LRRLPKAVLHDHIiDGGLRPATVVELARSVGHTLPTTDPDELAAWYYEAANSGDLVRYIAT 70

Query: 65 FEFIRPLLQTKEALRFAAYDVARQAALENVIYIEIRFAPELSMDKGLTASDTVLAVLEGL 124
FE ++Q +E L AA + A + V+Y E+R+APEL+ GL+ + V V EGL
Sbjct: 71 FEHTIAVMQNREGLLRAAEEYVLDLAADGVVYGEVRYAPEtNTRGGLSMREVVETVQEGL 130

Query: 125 -ALVCGMRQSSHKTTKDIIKHIVDLA PKGLVGFDFAGDEF 175
L+CGMR D ++ DLA G+VGFD AG E
Sbjct: 131 3TLLCGMRMF DRVREAADLAVAFRDAGWGFDIAGAED 184

Query: 176
+P +D + ++R P T+HftGE I +L + 6 +R+GH +T
Sbjct: 185

Query: 228 -GQRDLIKRFVEEDAVA-EMCLTSNLQTKAASSIQSFPYQELYDAGGKITINTDNRTVSD 285
G+ + +V + +A EMC TSNLQT AA+SI P L D G ++T+NTDNR VS
Sbjct: 245

Query: 286
T +T+E SL V G +ED NA+K++F E+ L+
Sbjct: 305

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2307 

A DNA sequence (GBSx2448) was identified in S.agalactiae <SEQ ID 7101> which encodes the amino
acid sequence <SEQ ID 7102>. Analysis of this protein sequence reveals the following: 

Possible site: 18 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2042 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9639> which encodes amino acid sequence <SEQ ID 9640>
was also identified. 

The protein has homology with the following sequences in the GENPEPT database. 



>GP:CAB13290 GB:Z99111 similar to sulfite reductase [Bacillus subtilis]
Identities = 63/146 (43%) , Positives = 87/146 (59%) , Gaps = 1/146 (0%) 

Query: 5 MMAKIVYASMTGOT-EEIADIVaDKLRDLGLDVEVEECTMVDAAD-FEDADlAIVATYTY 63 

MA +VYA+M+GNTE +AD++ L++ +V+ E +DA FDD 1+ TYT+ 
Sbjct: 1 MAKlLLVYATMSGNTEftMADLIERGLQEALaEVDRETamiDDAQLFTDYDHVIM 60 

Query: 64 GDGDLPDEIVDFYEDLAEVDLSGKVYGWGSGDTFYDYFCKSVDEFEAQFALTGAQKGAD 123

GDGDLPDE +D ED+ E+D SGK V GSGDT Y++FC +VD EA+ G 
Sbjct: 61 GDGDLPDEFLDLVEDMEEIDFSGKTCAVFGSGDTAYEFFCGAVDTIiEAKIKERGGDIVLP 120 

Query: 124 CVKVDLAAEDEDIENLEAPAEEIASK 149 

VK++ EE+EL F + AK

Sbjct: 121 SVKIENNPEGEEEEELINFGRQFAKK 146 

A related DNA sequence was identified in S.pyogenes <SEQ ID 7103> which encodes the amino acid 
sequence <SEQ ID 7104>. Analysis of this protein sequence reveals the following: 

Possible site: 14

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1641 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below. 

Identities = 116/147 (78%) , Positives = 136/147 (91%) 


Query: 5 MAlJUCIVYASMTGim;EIADIVADKLRDLGIJ)VEVEECT^m)AaDFEDADIAI^ 64 

MRLAKIVYASMTGNTEEIADIVA+KL++LG DV+++ECT VDA++FE+ADIA+VaTYTYG 
Sbjct: 1 MRLAKIVYASMTGNTEEIADIVANKLQELGHDVDIDECTTVDASEFENaDIAVVATYTYG 60 

Query: 65 DGDLPDEIVDFYEDLAEVDLSGKVYGWGSGDTFYDYFCKSVDEFEAQFALTGAQKGADC 124

DGDLPDEIVDB^DL ++DL GK+YGWGSGDTFYDYFCKSVD+F QFALTGA KGA+ 
Sbjct: 61 DGDLPDEIVDFYEDLQDLDLEGKIYGWGSGDTFYDYFCKSVDDFSEQFALTGAIKGAEP 120 

Query: 125 VKVDLAAEDEDIENLEAFAEEIASKUSr 151 
VKVDLAAEDEDI+ LEAFAE+++ +N

Sbjct: 121 VKVDLAAEDEDIDRLEAPAEQIiSQAVN 147 
Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2308 

A DNA sequence (GBSx2449) was identified in S.agalactiae <SEQ ID 7105> which encodes the amino
acid sequence <SEQ ID 7106>. Analysis of this protein sequence reveals the following: 

Possible site: 33 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3568 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:AAB98234 GB:U67480 chorismate mutase/prephenate dehydratase
(pheA) [Methanococcus jannaschii] 
Identities = 26/85 (30%), Positives = 46/85 (53%), Gaps = 1/85 (1%) 

Query: 2 ELEEIRQEIDEIDQQIiVSLLETRMGLILEVIAFKKKHRLPVLDNNraSNEVIiI^^ 61

+L EIR++IDEID +++ L+ R L +V K + +P+ D RE + + + K + 
Sbjct: 4 KIiaEIRKKIDEIDNKILKLX2iERNSIiAKDVaEIKNQLGIPINDP^:REKyiYDRIRKLCKE 63 

Query: 62 HQFDDVIRATFKDIMTE-SRVYQKE 85 
H D+ I 1+ E ++ QK+

Sbjct: 64 HNVDEasriGIKIFQILIEHNKaLQKQ 88 

There is also homology to SEQ ID 1568. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2309 

A DNA sequence (GBSx2450) was identified in S.agalactiae <SEQ ID 7107> which encodes the amino 
acid sequence <SEQ ID 7108>. This protein is predicted to be a minor structural protein. Analysis of this 
protein sequence reveals the following: 

Possible site: 23

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.1828 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:3U^C34413 GB:AP158600 putative minor structural protein
[Streptococcus thermophilus bacteriophage Sfi11]

Identities = 39/65 (60%) , Positives = 54/65 (83%) 

Query: 1 MEVETDSQEVLMSTGLKDLKAHAYPAITYEVDGYVDLELGDWRIQDDGYEPPLILTARV 60 
ME++TDS++VI1+ST L++L+ YPAITYEVDG++DL++GD V+IQD G+ P L+L ARV 
Sbjct: 707 MEIDTDSEDVLISTAt)RNIlRKFCypAITYEVDGFIX)LDIGDTVKIQDTGPSP^alMLEaRV 766

Query: 61 VEQDI 65 

EQ I 

Sbjct: 767 SEQQI 771 
No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2310

A DNA sequence (GBSx2451) was identified in S.agalactiae <SEQ ID 7109> which encodes the amino 
acid sequence <SEQ ID 7110>. This protein is predicted to be phosphomethylpyrimidine kinase (thiD). 
Analysis of this protein sequence reveals the following: 

Possible site: 45 
>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2051 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAC22074 GB:U32725 phosphomethylpyrimidine kinase (thiD)
[Haemophilus influenzae Rd]
Identities = 29/78 (37%) , Positives = 48/78 (61%) , Gaps = 2/78 (2%)






Query: 4 i^NVLAISGKTOIFSGGGLHADIiATyVVNKLHGFVAVTCLTAMSDKG-FEVIPIEASILKQQ 62 

+ VL I+G+D G G+ ADL T+ + + G' A+T +TA + G F++ PI ++ Q 
Sbjct: 5 KQVLTIAGSDSGGGftGIQM)LKTFQmGVraTSAITAVTAQl(roiK3VFDIHPIPLKri^ 64 

Query: 63 LESLK-DVEFGSIKLGLL 79

LE++K D + S K+G+L 
Sbjct: 65 LEAVKNDFQIASCKIGML 82 

There is also homology to SEQ ID 4408.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2311 

A DNA sequence (GBSx2452) was identified in S.agalactiae <SEQ ID 7111> which encodes the amino 
acid sequence <SEQ ID 7112>. Analysis of this protein sequence reveals the following:

Possible site: 57 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -7.43 Transmembrane 109 - 125 ( 102 - 129) 
INTEGRAL Likelihood = -1.28 Transmembrane 84 - 100 ( 84 - 100) 
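The INTEGRAL lines list predicted transmembrane segments with a likelihood score and two residue ranges. As an illustration only, the sketch below reads one such line into a small record. The names `Segment`, `parse_integral`, `outer_start`, and `outer_end` are hypothetical, and the interpretation of the parenthesised range as an outer or extended region is an assumption, since the text does not define it.

```python
import re
from typing import NamedTuple, Optional


class Segment(NamedTuple):
    likelihood: float  # score printed after "Likelihood ="
    start: int         # first residue of the "Transmembrane a - b" range
    end: int           # last residue of that range
    outer_start: int   # first residue of the parenthesised range (meaning assumed)
    outer_end: int     # last residue of the parenthesised range (meaning assumed)


INTEGRAL_RE = re.compile(
    r"INTEGRAL\s+Likelihood\s*=\s*(-?\d+\.\d+)\s+Transmembrane\s+"
    r"(\d+)\s*-\s*(\d+)\s*\(\s*(\d+)\s*-\s*(\d+)\s*\)"
)


def parse_integral(line: str) -> Optional[Segment]:
    """Parse one INTEGRAL line, or return None if the line does not match."""
    match = INTEGRAL_RE.search(line)
    if match is None:
        return None
    score, *positions = match.groups()
    return Segment(float(score), *(int(p) for p in positions))


lines = [
    "INTEGRAL Likelihood = -7.43 Transmembrane 109 - 125 ( 102 - 129)",
    "INTEGRAL Likelihood = -1.28 Transmembrane 84 - 100 ( 84 - 100)",
]
segments = [seg for seg in map(parse_integral, lines) if seg is not None]
# List the segments in sequence order rather than in score order.
for seg in sorted(segments, key=lambda s: s.start):
    print(seg.start, seg.end, seg.likelihood)
```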

Final Results

bacterial membrane --- Certainty=0.3972 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAA22372 GB:AL034446 putative transmembrane protein
[Streptomyces coelicolor A3(2)]
Identities = 25/93 (26%) , Positives = 43/93 (45%) , Gaps = 1/93 (1%) 

Query: 62 SASVEILCRGWLLPVSATKYSKIVSVSISSIFFGLLHSANNHVSLISIENLCL-FGLFLS 120 

+A+ E++ RG L + +++ ++ + FGL+H N +L + + G L+ 

Sbjct: 143 AATEEWFRGVLFRIIEEHIGTYLALGLTGLVFGLMHLLNEDATLWGAIAIAIEAGFMLA 202 
Query: 121 LYVILKGNIWGACGIHGAWNCVQGSVFGIEVSG 153

N+W G+H WN G VF VSG 
Sbjct: 203 AAYAATRNLWLTIGVHFGWNFAAGGVFSTWSG 235

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2312 

A DNA sequence (GBSx2453) was identified in S.agalactiae <SEQ ID 7113> which encodes the amino
acid sequence <SEQ ID 7114>. This protein is predicted to be pppL protein. Analysis of this protein 
sequence reveals the following: 

Possible site: 45 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.5796 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAA10712 GB:AJ132604 pppL protein [Lactococcus lactis] 
Identities = 38/64 (59%) , Positives = 51/64 (79%) 

Query: 1 MEISLLTDIGQRRSNNQDFINQFENKAGVPLIILADGMGGHRAGNIASEMTVTDLGSDWA 60

ME S+L+DIG +RS NQD++ + N+AG L +LaDGMGGH+AGN+AS++TV DLG W+ 
Sbjct: 1 MEYSILSDIGSKRSTNQDYVGTYVNRftfiyQLFLLADGMGGHKAGNVRSKLTVEDIXSKLWS 60 

Query: 61 ETDF 64 
ET F

Sbjct: 61 ETFF 64 
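In these alignments the middle line between Query and Sbjct marks each column: a letter for an identical residue, "+" for a conservative (positive-scoring) substitution, and a blank otherwise. The sketch below counts those marks for a single block; the helper name `midline_counts` is hypothetical, and it assumes the middle line is passed with its original column width, which the extracted text does not always preserve.

```python
def midline_counts(midline: str) -> dict:
    """Count identities and positives in one alignment middle line.

    Letters mark identical residues, '+' marks conservative substitutions,
    and blanks mark mismatches or gap columns.
    """
    identities = sum(1 for ch in midline if ch.isalpha())
    conservative = midline.count("+")
    return {
        "identities": identities,
        # Positives include the identical columns as well as the '+' columns.
        "positives": identities + conservative,
        "aligned": len(midline),
    }


# Final block of the pppL alignment: Query "ETDF", middle line "ET F", Sbjct "ETFF".
print(midline_counts("ET F"))
```

Summing these counts over every block of an alignment gives totals comparable to the Identities and Positives figures printed above it, provided the middle lines are intact.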

There is also homology to SEQ ID 3022: 

Identities = 58/74 (78%) , Positives = 69/74 (92%) 


Query: 1 MEISLLTDIGQRRSNNQDFINQFENKAGVPLIILADGMGGHRAGNIASEMTVTDLGSDWA 60 

M+ISL TDIGQ+RSNNQDFIN+F+NK G+ L+ILADGMGGHRAGNIASEMTVTDLG +W 
Sbjct: 1 MKISLKTDIGQKRSNNQDFINKFDNKKGITLVILftDGMGGHRAGNIASEMTVTDLGREWV 60 

Query: 61 ETDFSELSEIRDMM 74

+TDF+ELS+IRDW+ 
Sbjct: 61 KTDPTELSQIRDWL 74 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2313 

A DNA sequence (GBSx2454) was identified in S.agalactiae <SEQ ID 7115> which encodes the amino 
acid sequence <SEQ ID 7116>. This protein is predicted to be sunL protein. Analysis of this protein 
sequence reveals the following: 

Possible site: 25

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1631 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>



The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAA10711 GB:AJ132604 sunL protein [Lactococcus lactis]
Identities = 48/81 (59%) , Positives = 67/81 (82%) 

Query: 1 MSILSSVCQTLRKGGIITYSTCTIFEEENFQVIEKFLENHPNFEQVELSHTQEDIVKRGC 60 

+ IL+S ++L+K GI+ YSTCTIP+EENF V+ +FLENHENFEQVE+S+ + +++K GC 
Sbjct: 342 LEILNSaSKSLKKSGIMVYSTCTIPDEENFDWHEFLENHPNFEQVEISNEKPEVIKEGC 401 

Query: 61 ISISPEQYHTDGFFIGQVKRl 81 

+ I+PE YHTDGFFI + K+I 
Sbjct: 402 LFITPEMYHTDGFFIAKFKKI 422 

There is also homology to SEQ ID 3018: 

Identities = 64/82 (78%) , Positives = 74/82 (90%) 

Query: 1 MSILSSVCQTLRKGGIITYSTCTIFEEENFQVIEKFLENHPNFEQVELSHTQEDIVKRGC 60 

+ ILSSVCQTLRRGGIITYSTCTIF+EEN QVIE FL++HPNFEQV+L+HTQ DIVK G 
Sbjct: 359 LEILSSVCQTIiRRGGIITYSTCTIFDEEmQVIEAFI<3SHPNPEQVKIiNHTQftDIVKDGY 418 

Query: 61 ISISPEQYHTDGFFIGQVKRIL 82 

+ I+PEQY TDGFFIGQV+R+L 
Sbjct: 419 LIITPEQYQTDGFFIGQVRRVL 440 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2314 

A DNA sequence (GBSx2455) was identified in S.agalactiae <SEQ ID 7117> which encodes the amino 
acid sequence <SEQ ID 7118>. This protein is predicted to be PTS permease for mannose subunit IIPMan.
Analysis of this protein sequence reveals the following: 

Possible site: 53 

>>> Seems to have no N-terminal signal sequence
INTEGRAL Likelihood = -9.18 Transmembrane 32 - 48 ( 30 - 58)
INTEGRAL Likelihood = -8.07 Transmembrane 127 - 143 ( 122 - 146)
INTEGRAL Likelihood = -2.07 Transmembrane 56 - 72 ( 56 - 72)
INTEGRAL Likelihood = -1.44 Transmembrane 87 - 103 ( 86 - 103)
INTEGRAL Likelihood = -0.53 Transmembrane 105 - 121 ( 105 - 121)

Final Results

bacterial membrane --- Certainty=0.4673 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>



The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAP81084 GB:AF228498 AgaW [Escherichia coli] 
Identities = 38/122 (31%) , Positives = 68/122 (55%) , Gaps = 7/122 (5%) 



Query: 25 KVPETKSIIRLTALAFLVCSILWELVSMRELISSISFIGILVGSGPVNSFVHHIPQNLM 84 

++P T + L A +L L+++ +F+ I G+ + + +PQ L+ 

Sbjct: 126 RMPRTPILAALNACNYLA LLALOIFYFLCAPLPIYPGAEHRKTIIDVLPQRLI 178 



Query: 85 NGLSAAGGLLPAVGFAML^mlLlmi^KLAVFYLLGFVLTAYLKLPAVAV2iALGAVIC^^ 144 

+GL AG6++PA+GFA+L+K++ N +++LGFV A+LKLP +A+A + +1 
Sbjct: 179 DGLGVAGGlMPAIGFAVLLKimKNVYIPYFILGFVAAAWLKLPVLRIACPALAMALIDL 238 



Query: 145 QR 146 
R

Sbjct: 239 LR 240 

There is also homology to SEQ ID 1636: 

Identities = 104/109 (95%) , Positives = 108/109 (98%)

Query: 56 LISSISFIGILVGSGPVNSFVHHIPQNLMNGLSAAGGLLPAVGFAMLMKLLWTNKLAVFY 115 

+I+SISFIGiriVGSGPVN+FV HIPQNLMNGLSAAGGLLPAVGFAMLMKLLWTNKIAVFY 
Sbjct: 149 IIASISFIGILVGSGPVNAFVEHIPQ^MINGLSAAGGLLPAVGPiyyILMKLL^^ 208 


Query: 116 IJjGPVLTAYLKLPAVAVAALGAVICVISSQRDIELEIAITRGAISKQTTF 164 

LIX3FVLTAYLKLPAVAVAALGAVICVISSQRD+ELDAITRGAISRQTTF 
Sbjct: 209 LLGPVLTAYI.KLPAVA^ffiALGAVlCVISSQRDLEIlDAITRGAISKQTTF 257 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2315 

A DNA sequence (GBSx2456) was identified in S.agalactiae <SEQ ID 7119> which encodes the amino 
acid sequence <SEQ ID 7120>. Analysis of this protein sequence reveals the following: 

Possible site: 50

>>> Seems to have a cleavable N-term signal seq.
INTEGRAL Likelihood = -8.12 Transmembrane 121 - 137 ( 118 - 144)
INTEGRAL Likelihood = -5.52 Transmembrane 91 - 107 ( 89 - 111)
INTEGRAL Likelihood = -5.20 Transmembrane 166 - 182 ( 162 - 192)

Final Results

bacterial membrane --- Certainty=0.4248 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:CAB159e3 GB:Z99124 phosphotransferase system (PTS)
beta-glucoside-specific enzyme IIABC component [Bacillus subtilis]
Identities = 76/201 (37%) , Positives = 122/201 (59%) , Gaps = 3/201 (1%)

Query: 1 MIKALLALLLVFKILTPSSQTYILLNLFADGVFYPLPILIAITAAQKLKANPILALGTW 60
MIK L+AL ' + F + SQ +++L DG FYFLP+L+A++AA+K +NP +A
Sbjct: 121 MIKGLVALAVTFGWMAEKSQVHVILTAVGDGAFYFLPLLLAMSAARKFGSNPYVAAAIAA 180

Query: 61 MLLHPNWANLVASGKPVSLPHTIPFTLTNYASSVIPIILIICVQAYIEKYLKQIIPKSLR 120
+LHP+ L+ +GKP+S F +P T Y+S+VIPI+L I + +Y+EK++ + SL+
Sbjct: 181 AILHPDLTALLGAGKPIS-FIGLPVTAATYSSTVIPILLSIWIASYVEKWIDRFTHASLK 239

Query: 121 LVLVPMLIFLSMGILSFSILGPMGTIAGQYLAVIFTFLSKYASW-APAFLVGAFAPILIM 179
L++VP L + L+ +GP+G I G+YL+ +L +A A FL G F+ ++IM
Sbjct: 240 LIWPTPTLLIWPLTLITVGPLGAILGEYLSSGVNYLFDHAGLVBMIFLRGTPS-LIIM 298

Query: 180 FGVHSGIAALGITQLAKLGVD 200
G+H + I +A+ G D
Sbjct: 299 TGMHYAEVPIMINNIAQNGHD 319

There is also homology to SEQ ID 2884.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.
Example 2316

A DNA sequence (GBSx2457) was identified in S.agalactiae <SEQ ID 7121> which encodes the amino
acid sequence <SEQ ID 7122>. This protein is predicted to be glucose kinase. Analysis of this protein
sequence reveals the following: 

Possible site: 54 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1180 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB14416 GB:Z99116 glucose kinase [Bacillus subtilis]
Identities = 32/57 (56%) , Positives = 41/57 (71%) 

Query: 1 MVIGGGVSAAGEFLRSRVEKyFVTFAFPQVKKSTKIKIAELGNDAGIIGAASLaNQQ 57 

+V+GGGVS AGE LRS+VEK F AFP+ ++ I lA LGNDAG+IG A +A + 
Sbjct: 258 IVLGCSGVSRAGELLRSKVEKTFRKCAFPRAAQAftDISIARI/SNnfiSVIGGAWIA^ 314 

There is also homology to SEQ ID 198. An alignment of the GAS and GBS proteins is shown below: 

Identities = 50/56 (89%) , Positives = 53/56 (94%) 

Query: 1 MVIGGGVSAAGEFLRSRVEKYFVTFAFPQVKKSTKIKIAELGNDAGIIGAASLANQ 56 

+VIGGGVSAAGEFLRSR+EK5fFVTF FPQV+ STKIKIAELGNDAGIIGAASIA Q 
Sbjct: 264 WIGGGVSaAGEFLRSRIEKyFVTFTFPQVRYSTKIKIAELGNDAGIIGAASLaRQ 319 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2317 

A DNA sequence (GBSx2458) was identified in S.agalactiae <SEQ ID 7123> which encodes the amino 
acid sequence <SEQ ID 7124>. Analysis of this protein sequence reveals the following: 

Possible site: 19 

>>> Seems to have a cleavable N-term signal seq.

Final Results

bacterial outside --- Certainty=0.3000 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:CAB14385 GB:Z99116 similar to hypothetical proteins [Bacillus subtilis]
Identities = 37/86 (43%) , Positives = 51/86 (59%) 

Query: 3 MSVILIIVILLAFVAWASWNYWRVRRAAKFLDNESFQKEMSRGQLIDIREAGAFHRKHIL 62 

MS +++++I AF+ + +Y +R K L E F+ + QLID+RE F HIL 
Sbjct: 1 MSNMIVLIIFPAPIIYMIASYVyQQRIMKTLTEEEERAGYRKaQLIDVREENEPEGGHIL 60 

Query: 63 GRRNIPASQFKVALSRLRKDKPVLLY 88 

GRENIP SQ K + +R DKPV LY 
Sbjct: 61 GaRNIPLSQI.K!QRKNEIRTDKPVyi.Y 86 

There is also homology to SEQ ID 202. An alignment of the GAS and GBS proteins is shown below: 

Identities = 51/108 (47%) , Positives = 70/108 (64%) 
Query: 1 MDMSVILIlVlLLaWAWASWiroTOVRRAAKFLDNESFQKEMSRGQLIDIREAG^ 60

M +++ ++L+ V + +WNY+ R+ AK +DNE+F+ M +GQIj1D+RE AF KH 
Sbjct: 1 MSPITLILWLLLVGIVGYYTWNYFSFRKMAKQVDNETFKDVMRQGQLIDLREPAAFRTKH 60 

Query: 61 ILGftRNIPASQFKVaLSftLRKDKPVLLYDASRGQSIPRIVLLLRKERF 108

ILGftRN PA QF A+ LRKDKPVL+V+ R Q V L+K F 
Sbjct: 61 ILCaftRNFPAQQFDAAIKGLRKDKPVLIYElSMRPQYRVPAVKKLKKAGP 108 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics.

Example 2318 

A DNA sequence (GBSx2459) was identified in S.agalactiae <SEQ ID 7125> which encodes the amino
acid sequence <SEQ ID 7126>. This protein is predicted to be surface protein Rib. Analysis of this protein
sequence reveals the following:

Possible site: 24

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1892 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2319 

A DNA sequence (GBSx2460) was identified in S.agalactiae <SEQ ID 7127> which encodes the amino 
acid sequence <SEQ ID 7128>. Analysis of this protein sequence reveals the following: 

Possible site: 18 
>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3522 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 
No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics.

Example 2320 

A DNA sequence (GBSx2461) was identified in S.agalactiae <SEQ ID 7129> which encodes the amino 
acid sequence <SEQ ID 7130>. Analysis of this protein sequence reveals the following: 

Possible site: 25 
>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2770 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:AAB18708 GB:U38906 ORF33 [Bacteriophage r1t]
Identities = 56/85 (65%) , Positives = 66/85 (76%) , Gaps = 1/85 (1%)

Query: 1 MTNFATTDDVILLWRQLSVDEIKRAEALLETVSDTLRLEASKVGKNLDEMILETP-YFAT 59 

M FAT DD+ +LWR L DE +RAE LLE VSD+LR EA KVG++L MI E P YFA+ 
Sbjct: 1 MNPFATVDDLTMLWRPLKGDEKERAEKLLEIVSDSLREEftDKVGRDLYAMIAEKPSYFAS 60 


Query: 60 VLKSVTVDIV2UITLMTATQGEPMSQ 84 

V+KSVTVDIVARTIiMT+T EPM+Q 
Sbjct: 61 WKSVTVDIVARTLMTSTDQEPMTQ 85 

There is also homology to SEQ ID 1432.

Based on this analysis, it was predicted that this protein and its epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2321 

A DNA sequence (GBSx2462) was identified in S.agalactiae <SEQ ID 7131> which encodes the amino 
acid sequence <SEQ ID 7132>. This protein is predicted to be regulatory protein TypA (typA). Analysis of
this protein sequence reveals the following: 

Possible site: 41 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2238 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:BAB06351 GB:AP001516 GTP-binding protein TypA/BipA (tyrosine 
phosphorylated protein A) [Bacillus halodurans] 
Identities = 175/237 (73%) , Positives = 204/237 (85%) , Gaps = 1/237 (0%) 

Query: 1 MEDIFVGETVTPTDAIEPLPVLRIDEPTLQMTFLVNNSPFAGREGKWITSRKVEERLLAE 60

ME+I VGETV P D +PLP+LRIDEPTLQMTFIjVNNSPFAGREGK +TSRK+EERL AE 
Sbjct: 281 MEEINTOETVCPVDHQDPLPILRIDEPTIiQMTFLVNNSPFAGREGICHVTSRKLEERLRAE 340 

Query: 61 LQTDVSLRVDPTDSPDKWTVSGRGELHLSILIETMRREGYELQVSRPEVIIKEIDGVQCE 120 
L+TDVSLRVh- TDSPD W VSGRGELHLSILIE MRREGYEIiQVS+PEVII+EIDGVQCE

Sbjct: 341 LETDVSLRVENTDSPDMWWSGRGELHLSILIENMRREGYELQVSKPEVIIREIDGVQCE 400 

Query: 121 PFERVQIDTPEEYQGAIIQSIjSERKGDMIiDMQMViGNGQTRLIFLIPARGLIGYSTEFLSM 180 

P ERVQID PEEY GA+++SL ERKG+ML+M G+GQ RL F++PARGLIGY+TEFLS 
Sbjct: 401 PVERVQIDVPEEYTGAVMESLGERKGEMLNMTNTGSGQVRLEFMVPARGLIGYTTEFLSQ 460

Query: 181 TRGYGIMNHTFDQYLPWQGEIGGRHRGALVSIENGKATTYSIMRIEERGNLSFVNP 237 

TRGYGl+MH+FD Y PV G++GGR +G LVS+E GKAT Y I+++E+RG + PV P 
Sbjct: 461 TRGYGIINHSFDSYQPVTPGQVGGRRQGVLVSMETGKATQYGIIQVEDRGTI-FVEP 516 


There is also homology to SEQ ID 206. An alignment of the GAS and GBS proteins is shown below: 

Identities = 228/237 (96%) , Positives = 233/237 (98%) , Gaps = 1/237 (0%) 

Query: 1 MEDIFVGETVTPTDAIEPLPVLRIDEPTLQm'FLVNNSPFAGREGKKflTSRKVEERLLaE 60 

MEDIFVGET+TPTD +E LP+LRIDEPTLQMTFLVNWSPFAGREGKWITSRKVEERLLAE

Sbjct: 284 MEDIFVGETITPTDCVEALPILRIDEPTLQMTFLVNNSPFAGREGKWITSRKVEERLLAE 343 



Query: 61 LQTDVSLRVDPTDSPDKWTVSGRGELHLSILIETMRREGYELQVSRPEVIIKEIDGVQCE 120 
I^?rDVSLRVDPTDSPDKWrVSGRGELHLSILIETMRREGYELQVSRPEVIIKEIDGV+CE
Sbjct: 344 LQTDVSLRVDPTDSPDKWTVSGRGELHLSILIETMRREGYELQVSRPEVIIKEIDGVKCE 403 

Query: 121 PFERVQIDTPEEYQGAIIQSLSERKGDMLDMQMVGNGQTRLIFLIPARGLIGYSTEFLSM 180
PFERVQIDTPEEYQGAIIQSLSERKGDMLDMQMVGNGQTRLIFLIPARGLIGYSTEFLSM

Sbjct: 404 PFERVQIDTPEEYCJGAIIQSLSERKGDMLDMQMVGNGQTRLIFIilPlUlGLIGYSTEFIiSM 463 

Query: 181 TRG;YGIMiraTFDQYLPWQGElGGRHRG?aJVSIENGKATTYSimiEERGKLSFVro 237 
TRGYGIMNHTFDQYLPWQGEIGGRHRGALVSIENGKATTYSIMRIEERG + FVNP 
Sbjct: 464 TRGYGIMNHTFDQYLPWQGEIGGRHRGALVSIENGKATTYSIMRIEERGTI-FVNP 519

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2322 

A DNA sequence (GBSx2464) was identified in S.agalactiae <SEQ ID 7133> which encodes the amino
acid sequence <SEQ ID 7134>. This protein is predicted to be pseudouridine synthase family 1 protein 
(rluB). Analysis of this protein sequence reveals the following: 



Possible site: 34

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1950 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:CAB14248 GB:Z99116 similar to hypothetical proteins [Bacillus subtilis]

Identities = 59/105 (56%) , Positives = 85/105 (80%) 

Query: 5 VKERIYPVGRLDWDTTGLIiILOTmGDFTDKMIHPRNEIDKVYLARVKGIATKENLR 64

+ +RIYP+GRLD+OT+GLL+LTNDG+F +K++HP+ EIDK Y+A+VKGI KE LR L R 
Sbjct: 91 IPQRIYPIGRIJDYDTSGLLLLTNDGEFANKI^HPKYEIDKTyvaKVKBIPPKELIiR^^ 150 

Query: 65 GWIDGKKTKPJ\RYTIIKVDHEKNRSWEI.TIHEGRNHQVKKMFE 109 
G+ ++ KT PA+ ++ +D +K S+++LTIHEGRN QV++MFE

Sbjct: 151 GIRIiEEGKTAPAKaKLLSLDKJCKQTSIIQLTIHEGRNRQVRRMFE 195 

There is also homology to SEQ ID 4728: 

Identities = 96/109 (88%) , Positives = 106/109 (97%) 


Query: 1 MLPQVKERIYPVGRLDWDTTGLLILTIiTCDFTDKMXHPRNEIDKVYLARVKGIATKENLR 60

+LPQVKERIYPVGRLDWDT+G+L1LTNDGDFTD MIHPKNEIDKVYLftRVKGlATKENLR 
Sbjct: 94 LLPQVKERIYPVGRICraWSGVLILTiroGDFTDTMIHPRNEIDKVYLaRVKBIATKENLR 153 

Query: 61 PLTRGWIDGKKTKPARYTIIKVDHEKNRSWELTIHEGRNHQVKKMFE 109

PLTRG+VIDGKKTICPARY I++V+ +K+RS+VELTIHEGRNHQVKKMFE 
Sbjct: 154 PLTRGIVIDGKKTKPARYNIVRVERDKSRSIVELTIHEGRMHQVKKMFE 202 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics.

Example 2323 

A DNA sequence (GBSx2466) was identified in S.agalactiae <SEQ ID 7135> which encodes the amino 
acid sequence <SEQ ID 7136>. This protein is predicted to be L-ribulose 5-phosphate 4-epimerase.
Analysis of this protein sequence reveals the following:

Possible site: 19

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2827 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:ARD45716 GB:AF160811 L-ribulose 5-phosphate 4-epimerase
[Bacillus stearothermophilus]
Identities = 68/103 (66%) , Positives = 82/103 (79%) 

Query: 2 QEMRERVCEflNKSLPVHSLVKFTWGNVSEVDREAGLIVIKPSGVDYDQLTPENMWTDLE 61 

+E+++ V EAN IiP + LV FTWGNVS +DRE GL+VIKPSGV YD+LT ++^WV DL 
Sbjct: 3 EELKQAVLEftNLQLPQYRLVTFTWGNVSGIDRERGLWIKPSGVAYDKLTIDDMVVVDLT 62 

Query: 62 GNIVEGDIiNPSSDLPTHVQLYKAWPEVGGIVHTHSTEft.VGWaQ 104 

GN+VEGDL PSSD PTH+ LYK +P +GGIVHTHST A WAQ 
Sbjct: 63 GNWEGDLKPSSDTPTHLWLYRQFPGIGGIVHTHSTWATVWAQ 105 

There is also homology to SEQ ID 4600: 

Identities = 93/103 (90%) , Positives = 96/103 (92%) 

Query: 2 QEMRERVCEANKSLPVHSLVKFTWGNVSEVDREAGLIVIKPSGVDYDQLTPENMWTDLE 61 

QEMRERVC ANKSLP H LVKFTWGNVSEV RE G IVIKPSGVDYD LTPENMWTDL+ 
Sbjct: 6 QE^mRVCAANKSIlPQHGI.VKFTWGNVSEVCREK3RIVIKPSGVDyDLLTPEHMVVTDLD 65 

Query: 62 GMIVEGDmPSSDLPTHVQLYKAWPEVGGIVHTHSTEAVGWAQ 104 

COT+VEGDIiNPSSDLPIOT+LYKAWPEVGGIVHTHSTEAVGWAQ 
Sbjct: 66 GNWEGDIiNPSSDLPTHVELYKAWPEVGGIVHTHSTEAVCSWAQ 108 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2324 

A DNA sequence (GBSx2467) was identified in S.agalactiae <SEQ ID 7137> which encodes the amino 
acid sequence <SEQ ID 7138>. Analysis of this protein sequence reveals the following: 

Possible site: 20
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.3452 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAG05712 GB:AE004658 hypothetical protein [Pseudomonas aeruginosa] 
Identities = 141/200 (70%) , Positives = 162/200 (80%) , Gaps = 1/200 (0%) 

Query: 10 LSLGTDYETLANRFRPIFREISAGNVEREKARALPYEPIEWLKKAGFGAVRVPSEYGOAG 69 

LS G DYE LA RFRPIF 1+ G VERE+ R LP+E I WLK+AGFGAVRVP E+GGAG 
Sbjct: 14 LSEGADYELIAQRFRPIFARIAEGAVERERQRELPHEAIAWLKQAGFGAVRVPREHGGAG 73 

Query: 70 ASIGQLFQLLIELAEADSNIPQALRAHFAPVEDRIJIAPPGVDRDTWFARFVAGDLVGNGW 129 

AS+ QL QLLIELAEADSNI QALR HFAFVEDRLNA PG RD W RFV GDLVG W 
Sbjct: 74 ASLPQLVQLLIELAEADSNITQALRGHFAFVEDRLNAEPGPGRDRWLRRFVEGDLVGCAW 133 



Query: 130 TEVGTVKIGDVITKVSAQGDG-FVMIGTKPYSTGSIFADWIDVYAQRADNGADVIAWNA 188 
TEVG+V++G+V+T+VS + DG +V+NG+K+YSTGS+F+DWID+YAQR D GADVIA +
Sbjct: 134 TEVGSVMjGEVLTRVSRKDDGRWVVNGSKYYSTGSLPSDWIDLYAQRDDTGADVIAaiRT 193

Query: 189 RHAGVRHSDDWDGFGQRTTG 208

GVR SDDWDGFGQRTTG 
Sbjct: 194 DQPGVRQSDDWDGFGQRTTG 213

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for 
vaccines or diagnostics. 

Example 2325

A DNA sequence (GBSx2468) was identified in S.agalactiae <SEQ ID 7139> which encodes the amino 
acid sequence <SEQ ID 7140>. Analysis of this protein sequence reveals the following:

Possible site: 15
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.1919 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 
No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics. 

Example 2326

A DNA sequence (GBSx2474) was identified in S.agalactiae <SEQ ID 7141> which encodes the amino 
acid sequence <SEQ ID 7142>. Analysis of this protein sequence reveals the following: 

Possible site: 39
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2978 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics. 

Example 2327

A DNA sequence (GBSx2476) was identified in S.agalactiae <SEQ ID 7143> which encodes the amino 
acid sequence <SEQ ID 7144>. Analysis of this protein sequence reveals the following: 

Possible site: 61
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.5402 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for 
vaccines or diagnostics. 

Example 2328 

A DNA sequence (GBSx2477) was identified in S.agalactiae <SEQ ID 7145> which encodes the amino 
acid sequence <SEQ ID 7146>. This protein is predicted to be mercuric reductase. Analysis of this protein
sequence reveals the following: 

Possible site: 49
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2755 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:CAA70224 GB:Y09024 mercuric reductase [Bacillus cereus] 
Identities = 190/247 (76%) , Positives = 225/247 (90%) 

Query: 1 MELGQLFHHLGSEITLMQRSERIiIjKEYDPEISESVEKAIiIEQGIlILVKGft.TFERVEQSGE 60 
MELGQLEH+U3SE+TL+QRSERLrjKETOPEISESVEK+L+EQGINLVKjGa.T+ER+EQ+G+

Sbjct: 262 MELGQLPHNIfiSEOTlilQRSERLLKEYDPEISESVEKSLVEQGimjVI^^ 321 

Query: 61 IKRVTVTWGSREVIESDQLLVATGRKPlSrrDSU5LSAAGVETGKl<MEILIlTOFGQTSNEK 120 
IK+V+V VNG + +IE+DQLLVATGR PNT +tNL AAGVE G EI+I+D+ +T+N + 
Sbjct: 322 IKroraVEVN6KKRIIERDQLLVATGRTPOTATIJ^^:JRARlGVEIGSRGEIIIDDYSRTT^ 381

Query: 121 IYAAGDVTLGPQFVYVAAYEGGIITDNAIGGLNKKIDLSWPAVTFTNPTVATVGIjTEEQ 180 

IYAAGDVTLGPQFVYVAAY+GG+ NAIGGLNKK++L WP VTFT P +ATVGLTE+Q 
Sbjct: 382 lYARGDOTIfiPQFVYVAAYQGGVAAPNAIGGimKIiNLEVVPGVTFTAPAIATTOLTEQQ 441 


Query: 181 AKEKGTOVKTSVLPIXSAVPRAIVNRETTGVFKLVADAETLKVLGVHIVSEl^^ 240 

AKE GY+VKTSVLPL AVPRA+VNRETTGVFKLVAD++T+KVLG H+V+ENAGDVIYAA+ 
Sbjct: 442 AKENGYEVKTSVLPIJDAVPRALVNRETTGVFKLVADSKTMKVLGAHVVAENAGDVIYAAT 501 

Query: 241 LAVKFGL 247

LAVKFGL 
Sbjct: 502 LAVKFGL 508 

There is also homology to SEQ ID 1820.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics. 

Example 2329 

A DNA sequence (GBSx2478) was identified in S.agalactiae <SEQ ID 7147> which encodes the amino 
acid sequence <SEQ ID 7148>. Analysis of this protein sequence reveals the following: 

Possible site: 30
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.3642 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database.
No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for 
vaccines or diagnostics. 

Example 2330 

A DNA sequence (GBSx2479) was identified in S.agalactiae <SEQ ID 7149> which encodes the amino
acid sequence <SEQ ID 7150>. This protein is predicted to be surface protein Rib. Analysis of this protein 
sequence reveals the following: 



Possible site: 61
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.1936 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>



No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for 
vaccines or diagnostics. 

Example 2331 

A DNA sequence (GBSx2480) was identified in S.agalactiae <SEQ ID 7151> which encodes the amino
acid sequence <SEQ ID 7152>. This protein is predicted to be Nra. Analysis of this protein sequence 
reveals the following: 



Possible site: 36
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.1510 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9383> which encodes amino acid sequence <SEQ ID 9384> 
was also identified. 

The protein has no significant homology with any sequences in the GENPEPT database. 

A related DNA sequence was identified in S.pyogenes <SEQ ID 7153> which encodes the amino acid 
sequence <SEQ ID 7154>. Analysis of this protein sequence reveals the following:

Possible site: 16
>>> Seems to have no N-terminal signal sequence
INTEGRAL Likelihood = -0.64 Transmembrane 22 - 38 ( 22 - 38)

Final Results
bacterial membrane --- Certainty=0.1256 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>



WO 02/34771 PCT/GB01/04789



An alignment of the GAS and GBS proteins is shown below. 

Identities = 42/157 (26%) , Positives = 78/157 (48%) , Gaps = 2/157 (1%) 

Query: 71 LLGREFIDSQHFKDINAYFLRHFICYCYYFIPDFyFLNTSRLSY--SKDLYHLLDKGIiAD 128

LLG ++S FK I F R FI +PD + + R +K Y+ L + + 

Sbjct: 8 LIfiNNIIiNSLPFKRILVSFSRIiFISNLQVLLPDIHLFHYLRRQQKRNKSFYNTI.KTIVEE 67 

Query: 129 lEKrCjKGGNLTFSKHETVLLTMQLSNLIETFLT^LSVYVISSSNIRLQTYQ™ 188 
+ +G + +L T+QL L++T+L P+ VY+++++ L Ii+ YF

Sbjct: 68 WMSAEGIVGKLPSYHLLLFTIQLEELLKTYLPPIPVYLLTlOTAaiDLMTNMiSIYFPPA 127 

Query: 189 lAEFFFVNYQTTQIDEKLLKKADIIIAERRYISSLKN 225 
lA VN + + + +K +IIA+R+Y++ +++ 

Sbjct: 128 lATVMPVNVEIIPFKDIVKEKQSVIIADRQYLNLIQH 164

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2332 

A DNA sequence (GBSx2481) was identified in S.agalactiae <SEQ ID 7155> which encodes the amino
acid sequence <SEQ ID 7156>. Analysis of this protein sequence reveals the following: 

Possible site: 20
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.1383 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database.
No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics. 

Example 2333 

A DNA sequence (GBSx2482) was identified in S.agalactiae <SEQ ID 7157> which encodes the amino
acid sequence <SEQ ID 7158>. Analysis of this protein sequence reveals the following: 

Possible site: 60
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.4145 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database.
No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for 
vaccines or diagnostics.

Example 2334

A DNA sequence (GBSx2484) was identified in S.agalactiae <SEQ ID 7159> which encodes the amino 
acid sequence <SEQ ID 7160>. Analysis of this protein sequence reveals the following: 

Possible site: 57
>>> Seems to have no N-terminal signal sequence
INTEGRAL Likelihood = -2.02 Transmembrane 34 - 50 ( 34 - 50)

Final Results
bacterial membrane --- Certainty=0.1808 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics. 

Example 2335 

A DNA sequence (GBSx2485) was identified in S.agalactiae <SEQ ID 7161> which encodes the amino 
acid sequence <SEQ ID 7162>. Analysis of this protein sequence reveals the following: 

Possible site: 49
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.3488 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:CAB52002 GB:AL109663 hypothetical protein [Streptomyces
coelicolor A3(2)]

Identities = 61/141 (43%) , Positives = 86/141 (60%) , Gaps = 2/141 (1%) 

Query: 3 TYFDNFLKTNQAYADLHGTAHLPIKPKTKVAIVTCMDSRLHVAQALGLALGDAHILRNAG 62 
T D ++ N+ YA + +P +VA+V CMD+RL + ALGL LGD H +RNAG 

Sbjct: 5 TVTDRLVEftNERYAaAFADPG^OU^PVQRVaVVACMDftRLDIJ^AAMLKI^ 64

Query: 63 GRVTDDVLRSLVISQQQLGTRErNAn^HHTDCGAQTFTNEAFAftQLQRDLGVDMHGHDPLP 122

G VTDDV+RSL ISQ+ LGTR + ++HHT CG +T T E F L+ ++G 
Sbjct: 65 GVVTDDVIRSLTISQRALGTRSVALIHHTGCGMETITEE-FRHDLELKVG-QRPAWAVEA 122 

Query: 123 FNDIEESVREDVAKLHASPFL 143 

F D ++ VR+ + ++ SPFL 
Sbjct: 123 FRDADQDVRQSIERVRTSPFL 143 

A related DNA sequence was identified in S.pyogenes <SEQ ID 6469> which encodes the amino acid
sequence <SEQ ID 6470>. Analysis of this protein sequence reveals the following: 

Possible site: 20
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2295 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>



An alignment of the GAS and GBS proteins is shown below.

Identities = 109/146 (74%), Positives = 128/146 (87%)

Query: 1 MTTYFDNFLKTNQAYADLHGTAHLPIKPKTKVAIVTCMDSRLHVAQALGLALGDAHILRN 60 

+ +YF++F+ .NQAY LHGTAHLP+KPKTKVAIVTCMDSRLHVAQALGLALGDAHILRN 
Sbjct: 1 IMSYFEHFMAANQAYVALHGTAHLPLKPKTKVAIOTCMJSRmVAQALGLALG 60 

Query: 61 AGGRVTDDVLRSLVISQQQLGTREIVVLHHTDCGAQTFTOEaPAAQLQRDLGVDMHGHDF 120 

AGGRVT+D++RSLVISQQQ+GTREIVVLHHTDCGAC2TFTNE FA + LGVD+ G DF 
Sbjct: 61 AGGRVTEDMIRSLVISQQQMGTREIVVLHHTDCGAQTFTIffiGPAKHIHEHLGVDVSGQDP 120 

Query: 121 LPFNDIEESVREDVAKLHASPFLREE 146 

LPF D+E+SVRED+AK+ AS + ++ 
Sbjct: 121 LPFQDVEDSVREDMAKIRASSLISDD 146 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for 
vaccines or diagnostics. 

Example 2336 

A DNA sequence (GBSx2486) was identified in S.agalactiae <SEQ ID 7163> which encodes the amino 
acid sequence <SEQ ID 7164>. Analysis of this protein sequence reveals the following: 

Possible site: 26
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.0932 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:AAG08811 GB:AE004955 phosphoribosylaminoimidazole carboxylase,
catalytic subunit [Pseudomonas aeruginosa]
Identities = 20/27 (74%), Positives = 26/27 (96%)

Query: 1 MFKHAEEARGRGIKIIIAGAGGAAHLP 27

+F++AEEA GRG+++IIAGAGGAAHLP
Sbjct: 46 LFQYAEEAEGRGLEVIIAGAGGAAHLP 72

There is also homology to SEQ ID 910:

Identities = 27/27 (100%), Positives = 27/27 (100%)

Query: 1 MFKHAEEARGRGIKIIIAGAGGAAHLP 27

MFKHAEEARGRGIKIIIAGAGGAAHLP
Sbjct: 87 MFKHAEEARGRGIKIIIAGAGGAAHLP 113

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics.

Example 2337

A DNA sequence (GBSx2488) was identified in S.agalactiae <SEQ ID 7165> which encodes the amino
acid sequence <SEQ ID 7166>. Analysis of this protein sequence reveals the following:

Possible site: 43
>>> Seems to have an uncleavable N-term signal seq
INTEGRAL Likelihood = -6.85 Transmembrane 58 - 74 ( 53 - 80)
INTEGRAL Likelihood = -5.79 Transmembrane 103 - 119 ( 101 - 122)

Final Results
bacterial membrane --- Certainty=0.3739 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

There is also homology to SEQ IDs 880 and 9278. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics.
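
Each alignment header in these examples reports identities and positives as counts over the aligned length. In the usual BLAST text layout, the line between Query and Sbjct carries a letter where the residues are identical, a '+' for a conservative substitution, and a space otherwise. A rough sketch of recovering the counts from that middle line follows, using the 27-residue alignment to the Pseudomonas aeruginosa sequence in Example 2336. The function names `score_midline` and `percent` are hypothetical, the middle line is typed with scan errors in letter case normalized (an assumption), and percentages are truncated to whole numbers, which is also an assumption about how the reported figures were produced.

```python
def score_midline(midline, length):
    """Count identities and positives from a BLAST-style middle line.

    Letters mark identical residues, '+' marks conservative substitutions,
    and spaces mark mismatches. `length` is the aligned length; the middle
    line is padded with spaces in case trailing blanks were stripped.
    """
    padded = midline.ljust(length)
    identities = sum(ch.isalpha() for ch in padded)
    positives = identities + padded.count("+")
    return identities, positives

def percent(count, total):
    """Whole-number percentage, truncated (assumed convention)."""
    return 100 * count // total

query = "MFKHAEEARGRGIKIIIAGAGGAAHLP"
midline = "+F++AEEA GRG+++IIAGAGGAAHLP"

identities, positives = score_midline(midline, len(query))
print(f"Identities = {identities}/{len(query)} ({percent(identities, len(query))}%)")
print(f"Positives = {positives}/{len(query)} ({percent(positives, len(query))}%)")
```

For this alignment the counts come out as 20 identities and 26 positives over 27 positions, which agrees with the header printed for it. The same check will not work on alignment rows whose middle line was damaged in scanning.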

Example 2338 

A DNA sequence (GBSx2489) was identified in S.agalactiae <SEQ ID 7167> which encodes the amino 

acid sequence <SEQ ID 7168>. This protein is predicted to be short chain alcohol dehydrogenase. Analysis 

of this protein sequence reveals the following: 

Possible site: 16
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.1742 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9357> which encodes amino acid sequence <SEQ ID 9358> 
was also identified. 

The protein has homology with the following sequences in the GENPEPT database.

>GP:AAD06605 GB:AE001530 putative oxidoreductase [Helicobacter
pylori J99]
Identities = 68/94 (72%), Positives = 79/94 (83%)

Query: 4 IDLLVNNAGIALGLDKSYEMJFGDWMTMIimJVVGLIYLTRCILPKMVE 63

ID L+NNaGLALGL+K+VE + DW MI+TO+ GL++LTR ILP M+E ++G IINLGS 
Sbjct: 76 IDAIiINNaGLALGIJSIKAYECELDDWEVMIDTNIKGr.LHLTRLILPSM^ 135 

Query: 64 XAGTIPYPGANVYGASKAFVKQFSLNLRADLAGT 97 
AGT yPG NVYGASKAFVKQFSLNLRADLAGT

Sbjct: 136 lAGTYAYPGGNVYGASKAFVKQFSIJilliRADLAGT 169 

A related DNA sequence was identified in S.pyogenes <SEQ ID 7169> which encodes the amino acid 
sequence <SEQ ID 7170>. Analysis of this protein sequence reveals the following: 

Possible site: 18
>>> Seems to have an uncleavable N-term signal seq

Final Results
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

A related sequence was also identified in GAS <SEQ ID 9121> which encodes the amino acid sequence 
<SEQ ID 9122>. Analysis of this protein sequence reveals the following: 

Possible site: 12
>>> Seems to have an uncleavable N-term signal seq

Final Results
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>



An alignment of the GAS and GBS proteins is shown below.

Identities = 78/96 (81%), Positives = 87/96 (90%)

Query: 2 QSIDLLVmAGLALGLDKSYEADFGDWMTMIOTWVGLIYLTRCILPKMVEVNRGLIINL 61 

Q I +LVNNAGLALGLDK+YEADF +VJMTMIimi+VGLIYLTR +LP MV + G+IINL 
Sbjct: 82 QDITIIjVHNAGIALGI£IKAYE2\DFENWMTMimNIVGLIYLTRQIiLPHMVSK^ 141 

Query: 62 GSXAGTlPYPGJaiJVyGRSKAFVKQFSiaaiRaDIAGT 97 

GS AGTIPYPGftN+yGASKaFVKQFSIJ!ILRADLftG+ 
Sbjct: 142 GSTAGTIPyPGaNIYGRSKAFVKQFSiasiIiRADLAGS X77 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2339 

A DNA sequence (GBSx2492) was identified in S.agalactiae <SEQ ID 7171> which encodes the amino 
acid sequence <SEQ ID 7172>. This protein is predicted to be mercuric reductase. Analysis of this protein
sequence reveals the following: 

Possible site: 53
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2115 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:CAC14663 GB:Y10855 mercuric reductase [Bacillus licheniformis]
Identities = 68/104 (65%) , Positives = 82/104 (78%) 

Query: 1 MlJKFKVNISGMTCTGCEKHVESALEKlGaKNIESSYRRGEAVFELPDDIEVESAIKAIDE 60
M K++VN+ GMTCTGCE+HV ALE +GAK IE YRRGEAVFELP+ +EVE+A KAI E

Sbjct: 1 MKKYRVOTQGMTCTGCEEHVAVALEMGAKRIEVDYiyiGEAVFELENGLEVETAKKai^ 60

Query: 61 AMYQAGEIEEVSSLENVALINEDNYDLLIIGSGRAAFSSAIKAI 104
A YQ GE EEV S E + L +E +YD +IIGSG AAFSSAI+A+
Sbjct: 61 AKYQPGEAEEVQSQEIilQLGDEGDYDYlIIGSGGAAFSSAIEAV 104

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics. 

Example 2340

A DNA sequence (GBSx2494) was identified in S.agalactiae <SEQ ID 7173> which encodes the amino 
acid sequence <SEQ ID 7174>. Analysis of this protein sequence reveals the following: 



Possible site: 58
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.3341 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 
No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics. 

Example 2341 

A DNA sequence (GBSx2495) was identified in S.agalactiae <SEQ ID 7175> which encodes the amino 
acid sequence <SEQ ID 7176>. Analysis of this protein sequence reveals the following:

Possible site: 31
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.4989 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for 
vaccines or diagnostics. 

Example 2342 

A DNA sequence (GBSx2496) was identified in S.agalactiae <SEQ ID 7177> which encodes the amino
acid sequence <SEQ ID 7178>. Analysis of this protein sequence reveals the following:

Possible site: 30
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2569 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics.

Example 2343 

A DNA sequence (GBSx2497) was identified in S.agalactiae <SEQ ID 7179> which encodes the amino 
acid sequence <SEQ ID 7180>. This protein is predicted to be DNA polymerase III alpha subunit (dnaE). 
Analysis of this protein sequence reveals the following: 

Possible site: 60
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.3124 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related DNA sequence was identified in S.pyogenes <SEQ ID 4095> which encodes the amino acid 
sequence <SEQ ID 4096>. Analysis of this protein sequence reveals the following: 

Possible site: 36
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2600 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below.

Identities = 186/237 (78%) , Positives = 214/237 (89%) 

Query: 10 DET^KHSffjIFERPrOTERYSMPDIDIDLPDITOGEFLRYVRmyGSirasaQIVTFSTFGAK 69 
DPV+H+L+FERPLN+ERYSMPDIDIDnPDIVR EFLRYVRNRYGS HSAQIVTFSTFG K 
Sbjct: 321 DPVQHDLLFERFLMKERYSMPDIDIDLPDIYRSEFLRYVRHRYGSDHSAQIVTFSTFGPK 380

Query: 70 QAIRDVFKRFGASEYELTNITKKIHFRDNLTSVYNKNIAFRQIIDSKIEYQKAYDIAKRI 129 

QAIRDVFKRFG EYELTN+TKKI F+D+L +VY ++++FRQ+I+S+ E+QKA+ lAKRI 
Sbjct: 381 QAIRDVFKRFGVPEYELTNLTKKIGFKDSIATVYEKSISFRQVINSRTEFQKAFAIAKRI 440 


Query: 130 EGNPRQTSIHftAGVVMSDDLLTDHIPIiKNGEDmiTQYDASSVEDNGLLKMDFLGriRMLT 189 

EGNPRQTSIHRAG+VMSDD LT+HIPLK+G+DMMITQYDA. +VE NGLLKMDFLGLRNLT 
Sbjct: 441 EGNPRQTSIHaflGIVMSDDaLT^fflIPLKSGDDmITQYI2aHAVEftNGLrlKMDFLGLR^ 500 

Query: 190 FVQKMKEKVDKDYGISIQLETIDLEDKETLKLFAAGQTKGIFQFEQSGAINLLRRIR 246

FVQKM+EKV KDYG I + IDLED +TL LFA G TKGIFQFEQ+GAINLL+RI+ 
Sbjct: 501 FVQKMQEKSZAKDYGCQIDITAlDLEDPQTLlU^FAKGDTKGIFQFEQNGaiNLLK^^ 557 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics.

Example 2344 

A DNA sequence (GBSx2498) was identified in S.agalactiae <SEQ ID 7181> which encodes the amino 
acid sequence <SEQ ID 7182>. This protein is predicted to be a methylase. Analysis of this protein 
sequence reveals the following: 

Possible site: 60
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2121 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:AAG21729 GB:AF116907 putative methylase [Corynebacterium hoagii] 
Identities = 48/160 (30%), Positives = 85/160 (53%), Gaps = 6/160 (3%)

Query: 97 EPDDSENGHOTOTLEETDNQIPEEEVVETIPEIPVTDFYFPEDLTDFYPKTARDKVETNI 156 

EP+ + E + + ++E +P TDF D+ P A+ +V NI 

Sbjct: 1236 EPEAPTQPEAASAAETAEPAVEQQEPRAGPQSVPATDFALGTDV--HVPSGAKaRVRaNI 1293 

Query: 157 VAIRLVKNLEVEHRNASPSEQELLAKYVGWGGLANEFFDD---YNPKFSKEREELKSLVT 213 

A RLV L+ + R A+ EQ +L1A4-+ GWG + E FD+ + +++ ER L I.+ 
Sbjct: 1294 AAaRLVLELDEQQRPATAEEQAVLAQWSGWGAVP-EVFDNRSKFLSEWADERAALLDLLG 1352 



Query: 214 DKEYSDMKQSSLTAYYTDPSLIRQMWGIVERDGFTGWQIL 253

+K +S ++++L A+YTDP+++ ++W V+R G +1, 
Sbjct: 1353 EKGFSQARETTLNAIIYTDPAIVGELWRAVQRAGLPDGALL 1392 

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics.

Example 2345

A DNA sequence (GBSx2499) was identified in S.agalactiae <SEQ ID 7183> which encodes the amino 
acid sequence <SEQ ID 7184>. Analysis of this protein sequence reveals the following: 

Possible site: 34
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.1111 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 
No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for 
vaccines or diagnostics.

Example 2346 

A DNA sequence (GBSx2501) was identified in S.agalactiae <SEQ ID 7185> which encodes the amino 
acid sequence <SEQ ID 7186>. Analysis of this protein sequence reveals the following: 

Possible site: 39
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.4752 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:CAA61516 GB:X89232 DNA-directed RNA polymerase [Pediococcus
acidilactici]

Identities = 48/53 (90%), Positives = 52/53 (97%)

Query: 5 KKPETINYRTLKPEREGLFDEVIFGPTKDWECACGKYKRIRYKGIICDRCGVE 57

KKPETINYRTLKPE++GLFDE IFGPTKD+ECACGKYKRIRYKGI+CDRCGVE
Sbjct: 29 KKPETINYRTLKPEKDGLFDERIFGPTKDYECACGKYKRIRYRGIVCDRCGVE 81 


There is also homology to SEQ ID 384. 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics. 

Example 2347 

A DNA sequence (GBSx2502) was identified in S.agalactiae <SEQ ID 7187> which encodes the amino
acid sequence <SEQ ID 7188>. Analysis of this protein sequence reveals the following: 

Possible site: 22
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.3080 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>



The protein has homology with the following sequences in the GENPEPT database.

>GP:AAC00282 GB:AF008220 YtlR [Bacillus subtilis]
Identities = 61/216 (28%), Positives = 98/216 (45%), Gaps = 28/216 (12%)

Query: 8 IPCTYYPVGSGNDFARALKIPNL KETLTAIQTERLKEINCFIYDKGLIL- - 56
I ++ P G+ NDF+R I + K LT +T L +N F+ DK IL
Sbjct: 86 IELSFVPAGAYNDFSRGFSIKKIDLIQEIKKVKRPLT--RTFHLGSVN-FLQDKSQILYF 142

Query: 57 -NSLDLGFAAYVVWKRSNSKIKNirjSlRYRLGKITyiVIAIKSLIiHSSK ^^VQVLVE 109
N + +GF AYV KA ++ + RL + V + S LH+S + E
Sbjct: 143 MNHIGIGFDAYVNKKAMEFPLRRVFLFLRLRFLVYPL SHLHASATFKPFTLRCTTE 198

Query: 110 GETGQQIKUTOLYFFALftNmTFGGGITIWPKASALTAELDMVYAKGHTFLKRLSILLSL 169
ET + +D++P ++N+ ++GGG+ P A+ D+V + PLK+ +L +
Sbjct: 199 DETRE---FHDVWFAWSNHPFYGGGMKAAPIiANPREKTFDIVIVENQPFLKKYWLLCX^ 255

Query: 170 VFKRHTTSKSIKHQTFKAMTVYFPKNSLIEIDGEIV 205
F +HT + K +T Y DGEI+
Sbjct: 256 AFGKHTKMDGVTMFKAKDITFYTKDKIPEHftDGEIM 291

No corresponding DNA sequence was identified in S.pyogenes. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics. 

Example 2348 

A DNA sequence (GBSx2503) was identified in S.agalactiae <SEQ ID 7189> which encodes the amino 
acid sequence <SEQ ID 7190>. This protein is predicted to be protease subunit HflC (hflC). Analysis of this
protein sequence reveals the following: 

Possible site: 18
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.1809 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database.

>GP:AAG08326 GB:AE004907 protease subunit HflC [Pseudomonas aeruginosa]
Identities = 182/202 (90%), Positives = 194/202 (95%)

Query: 1 MSQTERAVLLQFGKWQTDVKPGLHVKVPYVTvIQTOKFDGRLLTLDAPTQRFLTLEKKAViyi 60

+ QTERAV+L+FG+W++DVKPGLH K+PYVNQVRKFD RLLTLDAPTQRFLTLKKKAVM
Sbjct: 26 VQQTERAVMLRFGRVVESDVKPGLHFKIPYVNQVRKFDARLLTLDAPTQRFLTLEKKAVM 85

Query: 61 VDAYAKIAKVKDAERFYTATSGLKQIADERLSRRLESGLRDQFGKRTLHEVVSGERDALm 120

VDAYAKWRV nAERFYTATSGLRQIADERLSRRLE+GLRDQPGKRTLHEWSGERDALM
Sbjct: 86 VDAYAKmVAtMRFYTATSGLKQIADERLSI«LEAGLRDQFGKRTiaEVVSGERDALMG 145

Query: 121 DITGSLNRl^KELGIEVLDVRVKAIDIiPKEVISlRSVFERMSTEREREAREHRM 180

DIT SIiNRMA+KELGIEV+DVRVKAIDLPKEVNRSVFERMSTEREREAREHRAKG EL E
Sbjct: 146 DITASIMJMAQKELGIEVIDVRVKAIDLPKEVNRSVFERMSTEREREAREHRARGRELAE 205

Query: 181 GIRADADRQRRVLLAEAYRESE 202

GIRADADRQRRV++AEAYRESE
Sbjct: 206 GIRADADRQRRVIVAEAYRESE 227

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics.

Example 2349

A DNA sequence (GBSx2504) was identified in S.agalactiae <SEQ ID 7191> which encodes the amino 
acid sequence <SEQ ID 7192>. Analysis of this protein sequence reveals the following: 

Possible site: 24
>>> Seems to have an uncleavable N-term signal seq

Final Results
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database. 
No corresponding DNA sequence was identified in S.pyogenes. 

Based on tiiis analysis, it was predicted that this protein and its epitopes, could be usefiil antigens for 
15 vaccines or diagnostics. 

Example 2350 

A DNA sequence (GBSx2505) was identified in S.agalactiae <SEQ ID 7193> which encodes the amino 
acid sequence <SEQ ID 7194>. This protein is predicted to be ABC transporter (ATP-binding;
daunorubicin resistance). Analysis of this protein sequence reveals the following:

Possible site: 56

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1846 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with the following sequences in the GENPEPT database. 

>GP:CAB15892 GB:Z99123 similar to ABC transporter (ATP-binding
protein) [Bacillus subtilis]

Identities = 88/231 (38%) , Positives = 132/231 (57%) , Gaps = 13/231 (5%) 

Query: 10 CJVIGYLPDVPKFroWraQEYLQLC- - -aGLRQtncrSLPIA^ 65 
++IGYLP P FY +MTA E+L +GL++ K I ++I1E VGL + +RI Y 

Sbjct: 69 RLIGYIjPQYPAFYSWMTAISEFLTFAGRLSGLSKRKCQEKIGEMLEFVGLHEARHKRIGGY 128

Query: 66 SRGMKQRLGLAQALIHXXKILICDEPTSALDPQGRQEILSIISQLRGQKTVIFSTHILSD 125 

S GMKQRLGLAQAL+H K LI DEP SALDP GR E+L ++ +L+ V+FSTH+L D 
Sbjct: 129 SGG^lKQRLGLAQflLLHKPKFLILDEPVSaLDPTGRFEVLD^I^KELICKHMAVLFSTHVLHD 188 

Query: 126 VEKVCDQVLILTKSGIH---NLEDLRDKASASVNQ]mj:iIKVSDNEAQKLaiiRPPIiNQKD 182 

E+VCDQV+I+ 1 L++L+ + +V L++ K+ +K + + + 

Sbjct: 189 AEQVCDQVVIMKNGEISWIOSELQELKQQQQTNVFTLSVKEKLEGWLEEKPYVSAIVYKNP 248 

Query: 183 QYYKVHLELSEAlWIREQALASFyRYLVEQEITPyFIELLEDSLEDFYLEVI 233

+ EL + + L+ + + +T E +SLED YL+V+ 

Sbjct: 249 S-'QAVFELPDIHAGRSLLSD CIRKGLTVTRFEQKTESLEDVYLKW 293 

There is also homology to SEQ ID 686. 

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics.

Example 2351

A DNA sequence (GBSx2506) was identified in S.agalactiae <SEQ ID 7195> which encodes the amino
acid sequence <SEQ ID 7196>. Analysis of this protein sequence reveals the following:

Possible site: 52

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.0679 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has homology with glycine-rich cell wall proteins (e.g. GB:AL161589 - the glycine-rich cell wall
protein from Arabidopsis thaliana) and to SEQ ID 6882.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics.

Example 2352 

A DNA sequence (GBSx2507) was identified in S.agalactiae <SEQ ID 7197> which encodes the amino 
acid sequence <SEQ ID 7198>. Analysis of this protein sequence reveals the following: 

Possible site: 35

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2890 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database.
No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics.

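The "Final Results" blocks in these Examples share one layout: a localisation site, a certainty value and a call in parentheses. As an illustration only, the following minimal Python sketch reads such a block into tuples and picks the site with the highest certainty. The function names, the regular expression and the sample text (the values reported for Example 2352) are assumptions made for this sketch; they are not part of the program that produced the output.

```python
import re

# Matches one localisation line of a "Final Results" block, e.g.
#   bacterial cytoplasm --- Certainty=0.2890 (Affirmative) < succ>
# The pattern is an assumption based on the layout seen in the Examples.
RESULT_LINE = re.compile(
    r"bacterial\s+(?P<site>\w+)\s*-*\s*Certainty=\s*(?P<score>\d\.\d+)\s*"
    r"\((?P<call>[^)]+)\)"
)


def parse_final_results(text):
    """Return a list of (site, certainty, call) tuples found in `text`."""
    rows = []
    for line in text.splitlines():
        m = RESULT_LINE.search(line)
        if m:
            rows.append((m.group("site"), float(m.group("score")), m.group("call")))
    return rows


def best_site(rows):
    """Return the row with the highest certainty, or None if `rows` is empty."""
    return max(rows, key=lambda r: r[1]) if rows else None


# Hypothetical input: the block reported for Example 2352.
example = """
Final Results
bacterial cytoplasm --- Certainty=0.2890 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
"""

rows = parse_final_results(example)
print(best_site(rows))
```

When several sites share the top certainty (for instance when all three are 0.0000), `best_site` simply returns the first one listed, so a caller may prefer to treat an all-zero block as "not clear" instead.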
Example 2353 

A DNA sequence (GBSx2508) was identified in S.agalactiae <SEQ ID 7199> which encodes the amino 
acid sequence <SEQ ID 7200>. Analysis of this protein sequence reveals the following: 

Possible site: 60

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2410 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9329> which encodes amino acid sequence <SEQ ID 9330> 
was also identified. 

The protein has no significant homology with any sequences in the GENPEPT database.
No corresponding DNA sequence was identified in S.pyogenes.

SEQ ID 9330 (GBS678) was expressed in E.coli as a His-fusion product. SDS-PAGE analysis of total cell
extract is shown in Figure 163 (lane 18; MW 53kDa), Figure 164 (lanes 2 & 3; MW 53kDa) and Figure 188
(lane 7; MW 53kDa). Purified protein is shown in Figure 242, lanes 6 & 7.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics.

Example 2354 

A DNA sequence (GBSx2509) was identified in S.agalactiae <SEQ ID 7201> which encodes the amino 
acid sequence <SEQ ID 7202>. This protein is predicted to be surface protein Rib. Analysis of this protein 
sequence reveals the following: 

Possible site: 24

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2025 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics.

Example 2355 

A DNA sequence (GBSx2510) was identified in S.agalactiae <SEQ ID 7203> which encodes the amino 
acid sequence <SEQ ID 7204>. This protein is predicted to be surface protein Rib. Analysis of this protein 
sequence reveals the following: 

Possible site: 24

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1892 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics.

Example 2356 

A DNA sequence (GBSx2511) was identified in S.agalactiae <SEQ ID 7205> which encodes the amino 
acid sequence <SEQ ID 7206>. This protein is predicted to be surface protein Rib. Analysis of this protein 
sequence reveals the following: 

Possible site: 24

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1892 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for 
vaccines or diagnostics. 

Example 2357 

A DNA sequence (GBSx2512) was identified in S.agalactiae <SEQ ID 7207> which encodes the amino
acid sequence <SEQ ID 7208>. Analysis of this protein sequence reveals the following:

Possible site: 28

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.0999 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database.
No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics.

Example 2358 

A DNA sequence (GBSx2514) was identified in S.agalactiae <SEQ ID 7209> which encodes the amino
acid sequence <SEQ ID 7210>. This protein is predicted to be surface protein Rib. Analysis of this protein
sequence reveals the following:

Possible site: 24

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1892 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for 
vaccines or diagnostics. 

Example 2359 

A DNA sequence (GBSx2515) was identified in S.agalactiae <SEQ ID 7211> which encodes the amino
acid sequence <SEQ ID 7212>. Analysis of this protein sequence reveals the following:

Possible site: 19

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2041 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

The protein has no significant homology with any sequences in the GENPEPT database.
No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics. 

Example 2360 

A DNA sequence (GBSx2516) was identified in S.agalactiae <SEQ ID 7213> which encodes the amino 
acid sequence <SEQ ID 7214>. This protein is predicted to be 30S ribosomal protein S6 (rpsF). Analysis of
this protein sequence reveals the following:

Possible site: 51

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3607 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related GBS nucleic acid sequence <SEQ ID 9423> which encodes amino acid sequence <SEQ ID 9424>
was also identified.

The protein has homology with the following sequences in the GENPEPT database.

>GP:CAB16128 GB:Z99124 ribosomal protein S6 (BS9) [Bacillus subtilis]
Identities = 41/72 (56%) , Positives = 58/72 (79%) , Gaps = 1/72 (1%)


Query: 1 MVRRFDSILSDNGftTVVESKDWEKRRIJffElQDFTEGLYHIVMVEfiEDRVMJffiFDRL^^ 60 

++ RF+++L+ NGA + +KDW KRRLAYEI DF +G Y IVNV++ DA A+ EFDRL+K 
Sbjct: 22 VIERFNNVLTSNGAEITGTKDWGKRRLAYEINDFRDGFYQIVNVQS-DAAAVQEFDRIiAK 80 

Query: 61 INGDILRHMIVK 72

I+ DI+RH++VK
Sbjct: 81 ISDDIIRHIVVK 92

A related DNA sequence was identified in S.pyogenes <SEQ ID 7215> which encodes the amino acid
sequence <SEQ ID 7216>. Analysis of this protein sequence reveals the following:

Possible site: 40

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2720 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

An alignment of the GAS and GBS proteins is shown below. 

Identities = 66/74 (89%) , Positives = 70/74 (94%)

Query: 1 MVARFDSILSDNGATVVESKDWEKRRIiAyEIQDFTEGLYHIVNVEAEDAVALNEFDRLSK 60 

+VARFDSI1>+DNGATWESKDWEKRRI>AYEI DF EGLYHIVN+EA DA ALNEFDRLSK 
Sbjct: 22 LVRRFDSILTDNGATVVESKDWEKRIUIiAVEIiroEREGLYHIVNLEATDAaAIJSEFDRL 81 


Query: 61 INGDILRHMIVKVD 74 

INGDILRHMIVK+D
Sbjct: 82 INGDILRHMIVKLD 95 

Based on this analysis, it was predicted that these proteins and their epitopes could be useful antigens for
vaccines or diagnostics.

Example 2361

A DNA sequence (GBSx2518) was identified in S.agalactiae <SEQ ID 7219> which encodes the amino
acid sequence <SEQ ID 7220>. This protein is predicted to be surface protein Rib. Analysis of this protein
sequence reveals the following:

Possible site: 49

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.5289 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.pyogenes.

Based on this analysis, it was predicted that this protein and its epitopes, could be useful antigens for
vaccines or diagnostics.

Example 2362 

A DNA sequence (GASx1R) was identified in S.pyogenes <SEQ ID 7221> which encodes the amino acid
sequence <SEQ ID 7222>. Analysis of this protein sequence reveals the following:

Possible site: 33

>>> Seems to have an uncleavable N-term signal seq

Final Results

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2363 

A DNA sequence (GASxSR) was identified in S.pyogenes <SEQ ID 7223> which encodes the amino acid 
sequence <SEQ ID 7224>. Analysis of this protein sequence reveals the following: 

Possible site: 20

>>> Seems to have an uncleavable N-term signal seq

Final Results

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2364

A DNA sequence (GASx11) was identified in S.pyogenes <SEQ ID 7225> which encodes the amino acid
sequence <SEQ ID 7226>. Analysis of this protein sequence reveals the following:

Possible site: 22

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2614 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2365 

A DNA sequence (GASx17) was identified in S.pyogenes <SEQ ID 7227> which encodes the amino acid
sequence <SEQ ID 7228>. Analysis of this protein sequence reveals the following:

Possible site: 30

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2849 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2366 

A DNA sequence (GASxlS) was identified in S.pyogenes <SEQ ID 7229> which encodes the amino acid 
sequence <SEQ ID 7230>. Analysis of this protein sequence reveals the following:

Possible site: 30

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2099 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2367 

A DNA sequence (GASx34) was identified in S.pyogenes <SEQ ID 7231> which encodes the amino acid
sequence <SEQ ID 7232>. Analysis of this protein sequence reveals the following:

Possible site: 54

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.0801 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2368 

A DNA sequence (GASx38) was identified in S.pyogenes <SEQ ID 7233> which encodes the amino acid
sequence <SEQ ID 7234>. Analysis of this protein sequence reveals the following:

Possible site: 18

>>> Seems to have an uncleavable N-term signal seq

Final Results

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:CAB12617 GB:Z99108 similar to protein-tyrosine phosphatase
[Bacillus subtilis]

Identities = 57/155 (36%) , Positives = 88/155 (56%) , Gaps = 12/155 (7%)

Query: 1 MKKVCFVCLGNICRSPMAEFVMKSIVS SDVMMIESRATSDWEHGNPIHSGTQSILK 56 

M V FVCLGNICRSPKKE + + + + + +S W GNP H GTQ IL+ 

Sbjct: 1 MISVLFVCLGNICRSPMREAIFRDLaAKRGLEGKIKaDSAGIGGWHIGNPPHEGTQEILR 60 

Query: 57 TYQINyDITKCSKQITITDFNTEDYIIGMDSDNVKNLKEMSQHQWDSKIYLFRE 110 

I++D ++Q++ D + FDYII MD++N+ +L+ M+ + S I + 
Sbjct: 61 REGISFD-GMIJ^QVSEQDH)DFDYIIAMDAENIGSLRSMA.GFKNTSHIKRLLDYVEDSD 119 

Query: 111 -GGVPDPWYTNDFEETYQLVRKGCQDWLSRLMSKE 144

VPDP+YT +PEE QL++ GC+ L+ + ++ 
Sbjct: 120 LADVPDPYyTGNFEEVCQLIKTGCEQLLASIQKEK 154 



Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2369

A DNA sequence (GASx42R) was identified in S.pyogenes <SEQ ID 7235> which encodes the amino acid 
sequence <SEQ ID 7236>. Analysis of this protein sequence reveals the following: 

Possible site: 14 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.4753 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2370 

A DNA sequence (GASx47R) was identified in S.pyogenes <SEQ ID 7237> which encodes the amino acid 
sequence <SEQ ID 7238>. Analysis of this protein sequence reveals the following: 

Possible site: 58 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2014 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2371 

A DNA sequence (GASx53R) was identified in S.pyogenes <SEQ ID 7239> which encodes the amino acid 
sequence <SEQ ID 7240>. Analysis of this protein sequence reveals the following: 

Possible site: 45 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -0.11 Transmembrane 56 - 72 ( 56 - 72)

Final Results

bacterial membrane --- Certainty=0.1044 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2372 

A DNA sequence (GASx67R) was identified in S.pyogenes <SEQ ID 7241> which encodes the amino acid 
sequence <SEQ ID 7242>. Analysis of this protein sequence reveals the following:

Possible site: 39

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1610 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2373 

A DNA sequence (GASx75) was identified in S.pyogenes <SEQ ID 7243> which encodes the amino acid 
sequence <SEQ ID 7244>. Analysis of this protein sequence reveals the following: 

Possible site: 31 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2803 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 
The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAA41942 GB:X59250 ribosomal protein B [Lactococcus lactis] 
Identities = 37/38 (97%) , Positives = 37/38 (97%) 

Query: 1 MKVRPSVKPICEYCKVIRRNGRVMVICPTNPKHKQRQG 38

MKVRPSVKPICEYCKVIRRNGRVMVICP NPKHKQRQG 
Sbjct: 1 MKVRPSVKPICEYCKVIRRNGRVMVICPANPKHKQRQG 38 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

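The "Identities = n/m (p%)" headers can be checked directly against the aligned residues. The following minimal Python sketch does this for the 38-residue alignment of Example 2373; it is an illustration under stated assumptions, not the program that produced the alignments. The function names are hypothetical, the query is taken with position 11 read as C (as the identity line of that alignment indicates), and the percentage is truncated rather than rounded, which is an assumption that agrees with most of the headers in this section.

```python
def count_identities(query, sbjct, gap="-"):
    """Count aligned columns where both sequences carry the same residue."""
    if len(query) != len(sbjct):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for q, s in zip(query, sbjct) if q == s and q != gap)


def percent(numerator, denominator):
    """Integer percentage, truncated (an assumption about the header format)."""
    return (100 * numerator) // denominator


# Aligned rows of Example 2373 (query position 11 read as C, see lead-in).
query = "MKVRPSVKPICEYCKVIRRNGRVMVICPTNPKHKQRQG"
sbjct = "MKVRPSVKPICEYCKVIRRNGRVMVICPANPKHKQRQG"

n = count_identities(query, sbjct)
print(f"Identities = {n}/{len(query)} ({percent(n, len(query))}%)")
```

Counting "Positives" would additionally need the substitution matrix used for the search, which is not stated here, so the sketch is limited to identities.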
Example 2374 

A DNA sequence (GASx76) was identified in S.pyogenes <SEQ ID 7245> which encodes the amino acid 
sequence <SEQ ID 7246>. Analysis of this protein sequence reveals the following: 

Possible site: 35

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.0824 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAB06824 GB:L47971 ribosomal protein S13 [Bacillus subtilis]
Identities = 86/121 (71%) , Positives = 103/121 (85%)

Query: 1 MARIAGVDIPNDKRVVISLTYVYGIGLATSKKILAAAGISEDIRVKDLTSDQEDAIRREV 60

MARIAGVDIP DKRVVISLTY++GIG T++++L AG+SED RV+DLT ++ IR +
Sbjct: 1 MARIAGVDIPRDKRVVISLTYIFGIGRTTAQQVLKEAGVSEDTRVRDLTEEELGKIRDII 60

Query: 61 imiKVEGDLRREVNmiKRIJffilGSXRGIRHRRGLPVRGQNTKNNaRTRKGKAVAI^ 121 
D +KVEGDIiRREV++NIKRL+EIGSyRGIRHRR6LPVRGQN+KNNaRTRKlG +A KKK

Sbjct: 61 DKLKVEGDLRREVSIiNIKRLIEIGSyRGIRHRRGLPVRGQNSKNHARTRKGPRRTVaNK^ 121 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2375

A DNA sequence (GASx81R) was identified in S.pyogenes <SEQ ID 7247> which encodes the amino acid
sequence <SEQ ID 7248>. Analysis of this protein sequence reveals the following:

Possible site: 21

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1842 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2376 

A DNA sequence (GASx82) was identified in S.pyogenes <SEQ ID 7249> which encodes the amino acid 
sequence <SEQ ID 7250>. Analysis of this protein sequence reveals the following: 

Possible site: 59

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3613 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2377 

A DNA sequence (GASx83) was identified in S.pyogenes <SEQ ID 7251> which encodes the amino acid
sequence <SEQ ID 7252>. Analysis of this protein sequence reveals the following:

Possible site: 51

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1141 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2378 

A DNA sequence (GASx85) was identified in S.pyogenes <SEQ ID 7253> which encodes the amino acid
sequence <SEQ ID 7254>. Analysis of this protein sequence reveals the following:

Possible site: 16

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2280 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2379

A DNA sequence (GASx89R) was identified in S.pyogenes <SEQ ID 7255> which encodes the amino acid
sequence <SEQ ID 7256>. Analysis of this protein sequence reveals the following:

Possible site: 44

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3040 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2380 

A DNA sequence (GASx102) was identified in S.pyogenes <SEQ ID 7257> which encodes the amino acid
sequence <SEQ ID 7258>. Analysis of this protein sequence reveals the following:

Possible site: 33

>>> Seems to have an uncleavable N-term signal seq
INTEGRAL Likelihood =-13.75 Transmembrane 21 - 37 ( 12 - 41)

Final Results

bacterial membrane --- Certainty=0.6498 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAC45312 GB:U81957 ComYC [Streptococcus gordonii]
Identities = 59/104 (56%) , Positives = 85/104 (81%) , Gaps = 1/104 (0%)

Query: 6 NNLRHKKLKGFTLLEMLLVILVISVLMLLFVPNLSKQKDRVTETGNAAVVKLVENQAELY 65

N L+ ++K FTL+EML+V+L+ISVLMLLFVPNL+KQK+ V++TGNAAVVK+VE+QAELY
Sbjct: 2 NKLKKLRVKAFTLVEMLVVLLIISVLMLLFVPNLTKQKEAVSDTGNAAVVKVVESQAELY 61

Query: 66 EL-SQGSKPSLSQLKADGSITEKQEKAYQDYYDKHKNEKAELSN 108 

EL + G + +LS+L A G+I++KQ +Y+ YY K+ +E ++N 
Sbjct: 62 ELKNTGDQATLSKLVAAGNISQKQADSYKAYYGKNNSETQAVRN 105 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2381 

A DNA sequence (GASx103) was identified in S.pyogenes <SEQ ID 7259> which encodes the amino acid
sequence <SEQ ID 7260>. Analysis of this protein sequence reveals the following:

Possible site: 24

>>> Seems to have a cleavable N-term signal seq.

Final Results

bacterial outside --- Certainty=0.3000 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAC23740 GB:AF052207 competence protein [Streptococcus pneumoniae]
Identities = 52/131 (39%) , Positives = 76/131 (57%)

Query: 8 IKAFTLLETLLSLSVMSFIILGLSVPVTKSYQKVEEHLFFSHFEHLYRHQQKLAILQQKQ 67 
IKAFT+LE+LL L ++S + LGLS V ++ VEE +FF FE LYR QK ++ Q++

Sbjct: 2 IKAFTMLESLLVLGLVSILALGLSGSVQSTFSAVEEQIFFMEFEELYRETQKRSVASQQK 61 
Query: 68 RVLDISSTKIVTEGNSLTTOKSITVlSraPTOLVIDQMGGIfflSMJCIIFDMTDRRFKYQFYL 127

L++ I LTVPK I + D+ GGN SLAK+ F + +YQ YL 

Sbjct: 62 TSiaiLDGQMISNGSQKLTVPKGIQAPSGQSITFDRAGGNSSLAKVEFQTSKGAIRYQLYL 121 

Query: 128 GSGNYQKTSQS 138

G+G ++ ++ 
Sbjct: 122 GNGKIKRIKET 132 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2382 

A DNA sequence (GASx104) was identified in S.pyogenes <SEQ ID 7261> which encodes the amino acid
sequence <SEQ ID 7262>. Analysis of this protein sequence reveals the following:

Possible site: 23

>>> Seems to have a cleavable N-term signal seq.

Final Results

bacterial outside --- Certainty=0.3000 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2383 

A DNA sequence (GASxlOP) was identified in S.pyogenes <SEQ ID 7265> which encodes the amino acid 
sequence <SEQ ID 7266>. Analysis of this protein sequence reveals the following: 

Possible site: 45

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood =-10.51 Transmembrane 37 - 53 ( 28 - 58)
INTEGRAL Likelihood = -3.56 Transmembrane 61 - 77 ( 60 - 77)

Final Results

bacterial membrane --- Certainty=0.5203 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 
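
The "Final Results" lines in these examples give one certainty score per compartment, and in the listings here the compartment with the highest score is the one marked Affirmative. A minimal Python sketch, assuming lines shaped like those in this listing, reads such a block and picks the top-scoring location. The names `parse_final_results` and `predicted_location` are illustrative only and are not part of any prediction program; the regular expression is an assumption about the layout.

```python
import re

# Matches lines such as:
#   bacterial membrane --- Certainty=0.5203 (Affirmative) < succ>
# Spaces inside the number ("0 . 5203") are tolerated because scanned
# copies of the listing often contain them. This pattern is an assumption
# about the layout, not a specification of the prediction program's output.
_RESULT = re.compile(
    r"bacterial\s+(?P<site>[a-z]+)\W*Certainty\s*=\s*"
    r"(?P<value>\d(?:\s*\.\s*\d+)?)\s*\((?P<call>[^)]*)\)",
    re.IGNORECASE,
)


def parse_final_results(text):
    """Return {compartment: (certainty, call)} for every result line found."""
    results = {}
    for match in _RESULT.finditer(text):
        # Remove any stray whitespace inside the number before converting.
        value = float(re.sub(r"\s+", "", match.group("value")))
        results[match.group("site").lower()] = (value, match.group("call").strip())
    return results


def predicted_location(text):
    """Return the compartment with the highest certainty, or None if no line matches."""
    results = parse_final_results(text)
    if not results:
        return None
    return max(results, key=lambda site: results[site][0])


if __name__ == "__main__":
    block = """
    bacterial membrane --- Certainty=0.5203 (Affirmative) < succ>
    bacterial outside --- Certainty=0 . 0000 (Not Clear) < succ>
    bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>
    """
    print(parse_final_results(block))
    print(predicted_location(block))
```

If two compartments tie, `max` returns whichever appears first in the block, so a caller who cares about ties would need to inspect the full dictionary.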

Example 2384

A DNA sequence (GASx115R) was identified in S.pyogenes <SEQ ID 7267> which encodes the amino acid
sequence <SEQ ID 7268>. Analysis of this protein sequence reveals the following: 

Possible site: 18

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood =-11.09 Transmembrane 20 - 36 ( 13 - 40)

Final Results

bacterial membrane --- Certainty=0.5437 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.



The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2385 

A DNA sequence (GASx124) was identified in S.pyogenes <SEQ ID 7269> which encodes the amino acid
sequence <SEQ ID 7270>. Analysis of this protein sequence reveals the following: 

Possible site: 52

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -8.17 Transmembrane 31 - 47 ( 29 - 59)
INTEGRAL Likelihood = -5.63 Transmembrane 737 - 753 ( 734 - 756)

Final Results

bacterial membrane --- Certainty=0.4270 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAC97148 GB:U49397 Cpa [Streptococcus pyogenes]
Identities = 401/737 (54%), Positives = 517/737 (69%), Gaps = 25/737 (3%)

Query: 25 SKNSKR--FTVTLVGVFLMIFALVTSMVGAKTVFGLVESSTPNAINPDSSSEyRWYGYES 82
S N+KR T+ L+ VFL AL+ + + FG E S PN S +Y WYGY+S
Sbjct: 11 SANNKRRQTTIGLLKVFLTFVALIGIVGPSIRAFGAEEQSVHSr--RQSSIQDYPWYGYDS 68

Query: 83 YVRGHPYYKQFRVAHDLRVNLEGSRSYQVYCENLKKAFPKSSDSSVKIOTYKKHDGISTKF 142
Y +G+P Y + H+L+VNLEGS+ YQ YCFNL K FP SDS +WYKK +G + F
Sbjct: 69 YPKGYPDYSPLKTYHNLKVNLEGSKDYQAYCFNLTKHFPSKSDSVRSQWYKKLEGTNENF 128

Query: 143 EDYAMSPRITGDEU3QKLEAVMYNGHPC2JANGIMEGLEPLNAIRVTQEAVWYYSDNAPIS 202
A PRI +L Q + ++YNG+P N NGIM+G+H-PLNAI VTQ A+W Y+D+A I
Sbjct: 129 IKLADKPRIEIXKJLQQNILRILYNGyPNNRNGIMKGIDPimiLVTQNAIW-YTDSAQI- 186

Query: 203 NPDESFKRESESNLVSTSQLSLMRQALKQLIDPNLATKMPKQVPDDFQLSIFESEDRGDK 262
NPDESFK E+ SN ++ QL LMR+ALK+LIDPNL +K + P ++L++FES D
Sbjct: 187 NPDESFKTEARSNGINDQQLGLMRKALKELIDENLGSKySNKTPSGYRLNVFESHD 242

Query: 263 YNKGYQNLLSGGLVPTKPPTPGDPPMPPNQPQTTSVLIRKYAIGDYSKLLEGATLQLTGD 322
K +QNLLS VP PP PG+ PP + + TSV+IRKYA GD SKLLEGATL+L+
Sbjct: 243 --KPFQNLLSAEYVPDTPPKPGEE--PPAKTEKTSVIIRKYAEGD-SKI1LEGATLKLSQI 297

Query: 323 NVNSFQARVFSSNDIGERIELSDGTYTLTELNSPAGYSIAEPITFKVEAGKVYTI - IDGK 381
+ FQ + F SK +GE +EL +GTyTLTE +SP GY lAEPI F+VE KV+ + DG
Sbjct: 298 EGSGFQEKDFQSNSLGETVELPNGTYTLTETSSPDGYKIAEPIKFRVENKKVFIVQKDGS 357

Query: 382 QIENPNKEIVEPYSVEAYNDFEEFSVLT-TQNYAKFYYAKNKNGSSQWYCFHRDLKSPP 440
Q+ENPNKE+ EPYSVEAYNDF + VL+ Y KFYYA NK+ SSQWYCFNADL SPP
Sbjct: 358 QVENPNKEVAEPYSVEAYNDETClEEWIiSGFTPYGKFYyATNKDKSSQWYCEmDLHSPP 417

Query: 441 DSEDGGKTMTPDFTT-GEVKYTHIAGRDLFKYTVKPRDTDPDTFLKHIKKVIEKGYREKG 499
DS D G+T+ PD +T EVKYTH AG DLFKY ++PRDT+P+ FLKHIKKVIEKGY++KG
Sbjct: 418 DSYDSGETIMPOTSTMKEVKn'HTAGSDLFKYALRPRDTNPEDFLKHIKKVIEKGYKKKG 477

Query: 500 QAIEYSGLTETQLRAATQLAIYYFTDSaELDKDKL KDYHGFGDMNDSTLAVAKILV 555
+ Y+GLTETQ RAATQIAIYVFTDSA+L K K YHGF M++ TLAV K L+
Sbjct: 478 DS--YWGLTETQFIUiATQI^IYYFTDSaDLKTLKTY]!mGKGYHGFESMDEKTLAVTKELI 535

Query: 556 EYAQDSNPPQLTDLDFFIPNNNKYQSLIGTQWHPEDLVDIIRMEDKK-EVIPVTHNLTLR 614
YAQ+ + PQLT+LDFF+PNN+K QSIjIGT+ HP+DLVD+IRMEDKK EVIPVTH+LT++
Sbjct: 536 TYAQNGSAPQLTNLDFFVPNNSraXSSLIGTECHPDDLVDVIRMEDKRQEVIFVTHSLTVK 595

Query: 615 KTTOSLaGDRTKDFHFEIELKNNKQELLSQTVKTOKTI!!^ 674
KTV G GD+TK F FE+ELK+ + + T+KT+ +L KDGK + NLKHG+++ ++G
Sbjct: 596 KTWGELGDKTKGFQFELELKDKTGQPIVNTLKTNNQDLVAKDGKYSFNLKHGDTIRIEG 655

Query: 675 LPEGYSYLVKETDSEGYK«CVNSQEVANAWSKTGITSDETLAFENNKEPVVPTGVDQKI 734
LP GYSY +KE +++ Y V V+++ A IT D+ + FEN K+ V PTG+
Sbjct: 656 LPTGYSYTLKEAEAKDYIVTVDNKVSQEftQSVGKDITEDKKVTFElTOKDLVPPTGLTTDG 715

Query: 735 NGYLALIVIAGISLGIW 751
YL L+++ + L +W
Sbjct: 716 AIYLWIiLLVPLGLLVW 732




Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.



Example 2386 

A DNA sequence (GASx125R) was identified in S.pyogenes <SEQ ID 7271> which encodes the amino acid
sequence <SEQ ID 7272>. Analysis of this protein sequence reveals the following: 

Possible site: 40

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2604 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2387 

A DNA sequence (GASx126) was identified in S.pyogenes <SEQ ID 7273> which encodes the amino acid
sequence <SEQ ID 7274>. Analysis of this protein sequence reveals the following: 

Possible site: 14

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1537 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAC97149 GB:U49397 LepA [Streptococcus pyogenes]
Identities = 59/132 (44%), Positives = 84/132 (62%), Gaps = 5/132 (3%)

Query: 1 MIIKRNDMAPSVKAGDAILFYRLSQTYKVEEAWYEDSKTSITIWGRIIAQAGDEVDLTE 60
Mil NDM+P++ AGD +L+YRL+ + + WYE T KVGRI AQAGDEV+ T+
Sbjct: 42 MIIOTOTMSPALSAGDGVLYYRIjmRSHira3VVVYEVDOT--LKV^GRiaAQAGDEVN^ 99

Query: 61 QGELKINGHIQNEG---LTFIKSREftOTPYRIADNSYLILiroYySQESEimiQi3AIAK^ 117
+G L INGH + LT+ S N+PY++ +Y ILNDY + ++ A+ +
Sbjct: 100 EGGLLINGHPPEKEVFYLTYPHSSGPNFPYKVPTGTYPILNDYKEERIiDSRY^ 159

Query: 118 IKGTINTLIRLR 129
IKG I+TL+R+R
Sbjct: 160 IKGKISTLLRVR 171

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 
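
Each homology hit in these examples is summarised by a line of the form "Identities = a/n (p%), Positives = b/n (q%), Gaps = c/n (r%)". As a rough illustration, the Python sketch below pulls the raw counts out of such a line and reports them as fractions of the alignment length. `parse_alignment_summary` is a hypothetical helper introduced for this illustration, and the tolerance for stray spaces is an assumption made for scanned text.

```python
import re

# One "Name = count/length" field, e.g. "Identities = 59/132".
# The printed percentage that follows is deliberately ignored, so the
# fractions returned here are recomputed from the counts alone.
_FIELD = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)\s*/\s*(\d+)")


def parse_alignment_summary(line):
    """Return {field: (count, length, fraction)} for one summary line.

    Fields that are absent (Gaps is omitted for ungapped hits) are simply
    not included in the result.
    """
    summary = {}
    for name, count, length in _FIELD.findall(line):
        count, length = int(count), int(length)
        # Guard against a zero length produced by a damaged scan.
        fraction = count / length if length else 0.0
        summary[name.lower()] = (count, length, fraction)
    return summary


if __name__ == "__main__":
    line = "Identities = 59/132 (44%) , Positives = 84/132 (62%) , Gaps = 5/132 (3%)"
    for field, (count, length, fraction) in parse_alignment_summary(line).items():
        print(f"{field}: {count}/{length} = {fraction:.3f}")
```

The sketch does not try to reproduce the printed percentages; it only exposes the counts so that hits can be compared on a common scale.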

Example 2388

A DNA sequence (GASx127) was identified in S.pyogenes <SEQ ID 7275> which encodes the amino acid
sequence <SEQ ID 7276>. Analysis of this protein sequence reveals the following:

Possible site: 17

>>> Seems to have a cleavable N-term signal seq.

INTEGRAL Likelihood = -3.93 Transmembrane 312 - 328 ( 311 - 337)

Final Results

bacterial membrane --- Certainty=0.2572 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAC97152 GB:U49397 unknown [Streptococcus pyogenes]
Identities = 125/355 (35%), Positives = 191/355 (53%), Gaps = 26/355 (7%)

Query: 1 MKLRHLLLTGAALTSFA ATTVHGET--WNGAKliTVTKNL-DLVNSNaLIPNTDF 52
MK LLL A L + + + ET V++G+ L V K + N L+P D+
Sbjct: 1 MKlCNKLLLATAILATAIfiMB^SMSQNIKAETAGVIDGSTLVVKKTFPSYTDDNVIJlPK^

Query: 53 TFKIEPDTTVN---EDGNKFK-GVALNTPMTK-VTYTNSDKGGSNTKTAEFDFSEVTFEK 107
+FK+E D +DG K GV TK + Y+NSDK + K+ F+F+ V F
Sbjct: 61 SFKOTlADrmKGKTKDGIiDIKPGVIIXSLEtTrKTIRYSNSDKITAKEKSVNFEFANVKFPG 120

Query: 108 PGVYYYKVTEEKIDKVPGVSYDTTSYTVQVHVLWNEEQQKPVATYIVGyKEGS--KVPIQ 165
GVY Y V E +K G++YD+ +TV V+V+ N+E YIV + G K P+
Sbjct: 121 VGVYRYTVAEVNGNKA-GITYDSQQWTVDVYW-NKEGGGFEVKYIVSTEVGQSEKKPVL 178

Query: 166 FKNSLDSTTLTVKKKVSGTGGDRSKDFNFGLTLKANQYYKASEKVMIEKTTKGGQAPVQT 225
FKNS D+T+L ++K+V+G G+ + F+F L L N+ + EK + +GG+
Sbjct: 179 FKNSFDTTSLKIEKQVTGNTGEHQRLFSFTLLLTPNECF EKGQWNILQGGETK 232

Query: 226 EASIDQLYHPTLKDGESIKVTHLPVGVDYVVTEDDYKSEKYTrNVEVSPQDGAVKNIAGN 285
+ I + Y PTLKD S+ ++ LPVG++Y +TE+D + Y T+ + + + G
Sbjct: 233 KWIGEEYSFTLKDKGSVTLSQIiPVGIEYKLTEEDVTKDGYKTSATLKDGEQSSTYELGK 292

Query: 286 STEQETSTDKDMTITFTNKKDFEVPTGVAMTVAPYIALGIVAVGGALYFVKKKNA 340
+ + S D+ I TNK+D +VPTGV T+AP+ L IVA+GG +Y K+K A
Sbjct: 293 DHICrDKSADE---IVVTNKRDTQVPTGWGTLAPFAVLSIVAIGGVIYITKRKKA 344

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2389

A DNA sequence (GASx128) was identified in S.pyogenes <SEQ ID 7277> which encodes the amino acid
sequence <SEQ ID 7278>. Analysis of this protein sequence reveals the following:

Possible site: 44

>>> Seems to have a cleavable N-term signal seq.

Final Results

bacterial outside --- Certainty=0.3000 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAC97152 GB:U49397 unknown [Streptococcus pyogenes]
Identities = 115/240 (47%), Positives = 178/240 (73%), Gaps = 3/240 (1%)

Query: 1 MIVRLIKLLDKIjIOTIVLCFFFLCLLIAaLGIYDALTVYQGANATNyQQyKKKGVQ--FD 58
M++ ++++++K 1+ ++L F + L +A G++D+ H-YQ A+A+N++++K Q F+
Sbjct: 351 ^mMTIVQVINKAII)TLILIFCLVVLFrAGFGLWDSyHLYQQarftSNFKKFKTAQQQPKFE 410

Query: 59 DLIAINSDVmWLTVRSTHIDYPIVQGENNLEYINKSVEGEYSIiSGSVFLDYRNKVTFED 118
DLLA+N DV+ WL + GTHIDYP+VQG+ NLEYINK+V+G ++SGS+FLD RN F D
Sbjct: 411 DLIjRtNEDVIGWLNIPGTHIDYPLVQGKTNLEYINKRVDGSVAMSGSLFIOT 470

Query: 119 KySLIYAHHMAGNVMFGELPNFRKKSPENKHKEFSIETKTKQKLKINIFACIQTDAFDSL 178
YSLIY HHMAGN MFGE+P F KK+PFNKH + lETK ++10. + IFAC++TDAFD L
Sbjct: 471 DYSLIYGHHTffiGNBMFGEIPKFLKKNFFNKHNKAIIETKERKKLTVTIFACLKTO^ 530

Query: 179 LENPIDV-DISSKNEFIOTIKQKSVQYREILTTNESRFVMiSTCEDMTTDGRIIVIGQIE 237
+FNP + + + + +++I ++S Q++ + + ++FVa STCE+ +TD R+IV+G 1+
Sbjct: 531 VFNPNAITNQDQQRQLVDYISKRSKQFKPVKLKHHTKFVAFSTCENFSTDNRVIWGTIQ 590

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2390

A DNA sequence (GASx129) was identified in S.pyogenes <SEQ ID 7279> which encodes the amino acid
sequence <SEQ ID 7280>. Analysis of this protein sequence reveals the following: 

Possible site: 26

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -6.05 Transmembrane 5 - 21 ( 4 - 22)
INTEGRAL Likelihood = -5.04 Transmembrane 191 - 207 ( 186 - 209)

Final Results

bacterial membrane --- Certainty=0.3421 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

LPXTG motif: 181-186

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAC97151 GB:U49397 unknown [Streptococcus pyogenes]
Identities = 64/213 (30%), Positives = 106/213 (49%), Gaps = 20/213 (9%)

Query: 1 MKKSILRILAIGyLLMSFCLLDSVEAENLTASINIEVIN 60
M+K + ++ +L +V A++ T +IVN++A+ F +
Sbjct: 1 MR]C™KMLFSVVmiiTMIiRFNQT\njUa3STVQTSISV^^ FSIAL 54

Query: 61 EAIjDKESPLENSVTTSVKGNGKTSFEQLTFSEVGQYimCIHQIiLGKNSQYHTOET^ 120
E++D + ++ G+GK SF L F+ VGQY Y+++Q +N Y D TV++V+
Sbjct: 55 ESIDAMKTIEE---ITIAGSGKASFSPIiNFTTVGQYTYRVYQKPSQNKDYQADTTVFDVL 111

Query: 121 lYVLYNEQSGALETNLVSNKLGETEKSELIFKQEYSEKTPEPHQPDTTEKEKPQKKRNGI 180
+YV Y+E G L ++S + G+ EKS + FK + K P QPD +
Sbjct: 112 VYVTYDE-DGTLVAKVISRRAGDEEKSAITFKPKRLVKPIPPRQPDIPRrP 161

Query: 181 LPSTGEMVSYVSALGIVLVaTITLYSIYKKLKT 213
LP GE+ S + L IVL+ + L + KKLK+
Sbjct: 162 LPLAGEVKSLLGILSIVLLGLLVLLYV-KKLKS 193




Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.



Example 2391 

A DNA sequence (GASx130R) was identified in S.pyogenes <SEQ ID 7281> which encodes the amino acid
sequence <SEQ ID 7282>. Analysis of this protein sequence reveals the following: 

Possible site: 57

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1614 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:CAB54046 GB:AJ245436 hypothetical protein, 57.8 kD [Pseudomonas putida]
Identities = 128/388 (32%), Positives = 204/388 (51%), Gaps = 21/388 (5%)

Query: 4 IGSWQRQELVFIPAQLKRINHVQHAYKCQTCSDNSLSDKIIKAPVPKAPLAHSLGSASI 63
IG V Q L +P Q++ I HV+ Y C+ C ++ A P + S+ S S+
Sbjct: 126 IGEEVSEQ-LEIVPMQIRVIKHVRKVYGCRDCESAPVT ADKPAQMIEKSMASPSV 179

Query: 64 lAHTVHQKFTLKVPNYRQEEDVraKLGLSISRKEIANTOIKSSQYYFEPLYDIJlJlDIIiLSQ 123
+A + K+ +P +R E+ + G+ I R+ +A W 1+ S++ F+PL +L+R+ LL+
Sbjct: 180 LAMLLTTKYVDGLPLHRFEKVLGRHGIDIPRQTrARWIQCSEH-FQPLIjNLMRESLIiMS 238

Query: 124 EVIHADETSYRVLESD TQLTYYWTFIiSGKHEKKGITLYHHDKRRSGLVTQEVLGDY 179
+IH DET +VL+ + ++ W G ++ + L+ + R+ V +L Y
Sbjct: 239 RIIHCDETRVQVLKEPGREPSSQSWMWVQTGGPPDRP-VILEDYATSRAQEVPVRLLDGY 297

Query: 180 SGYVHCD^ffiGAYRQL---EHAKLVGCWaHVRRKFFEATPKQAD-Ia'SLGRKGLVYCDIa:lF 235
GYV D + Y L + + +GCWaH RRKF EA Q KT h h-KLh-
Sbjct: 298 RGYVMTDDYAGYNALftAQDGLERIXSCWAHaRRKFVEAQKVQPKGKTCRADIAI^ 357

Query: 236 ALEAEWCELPPQERLVKRKEILTPLMTTFFDWCR--EQWLSGSKLGLAIAYSLKHERTF 293
+E + + ++R V R E PL+T +W + V+ + LGAIY +
Sbjct: 358 GVERDLKDSDDEDRKV3VR^ffiRSLPLLTQLKHWVEKTQPQVTTQNM/3KA.lGYIASm 417

Query: 294 RTVLEDGHIVLSNNMAERAIKSLVMGRK2WLFSQSFEGAKAAAIIMSLLETAKRHGLNSE 353
+E G++ + NN AERAI+ V+GRKNWLFS + +GA A+A + SL+ETAK +G
Sbjct: 418 ERyVEHGyLP^©NMAaERAIRPFVIGRKl#^JFSDTPK^ 477

Query: 354 KYISYLLDRLENEETLAKREVLEAyLPW 381
++ + L+RLP ++ E EA. LPW
Sbjct: 478 AWLRHALERI:jPQACSV---EDYEALLPW 502

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2392 

A DNA sequence (GASx131R) was identified in S.pyogenes <SEQ ID 7283> which encodes the amino acid
sequence <SEQ ID 7284>. Analysis of this protein sequence reveals the following:

Possible site: 37

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.4465 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2393 

A DNA sequence (GASx132R) was identified in S.pyogenes <SEQ ID 7285> which encodes the amino acid
sequence <SEQ ID 7286>. Analysis of this protein sequence reveals the following: 

Possible site: 46

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1529 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:BAA84885 GB:AB024946 orf50 [Escherichia coli]
Identities = 37/91 (40%), Positives = 53/91 (57%)

Query: 10 QVYLVa3KTDMRQGIDSIATt.VKSQHEI£)LFSGAVYLPCGGia?DRFKRLm)GQGF^ 69
+++LV G TDMR G + LA V++ + D FSG +++F G R D+ K L+ D G L
Sbjct: 9 RIVffiVAGITDMRNGFNGMSKVQimKDDPFSGHiFIFRGRRGDQIKjn^Wft^ 68

Query: 70 KRFENGKLAWPRNRDEVKCLTAVQVDWLMKG 100
KR E G+ WP RD LT Q+ L++G
Sbjct: 69 KRLERGRFVWPVTRDGKVHLTPAQLSMLLEG 99

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2394 

A DNA sequence (GASx133R) was identified in S.pyogenes <SEQ ID 7287> which encodes the amino acid
sequence <SEQ ID 7288>. Analysis of this protein sequence reveals the following:

Possible site: 18

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1979 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2395 

A DNA sequence (GASx135R) was identified in S.pyogenes <SEQ ID 7289> which encodes the amino acid
sequence <SEQ ID 7290>. Analysis of this protein sequence reveals the following: 

Possible site: 20

>>> Seems to have a cleavable N-term signal seq.

Final Results

bacterial outside --- Certainty=0.3000 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2396

A DNA sequence (GASx136) was identified in S.pyogenes <SEQ ID 7291> which encodes the amino acid
sequence <SEQ ID 7292>. Analysis of this protein sequence reveals the following: 

Possible site: 54

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial membrane --- Certainty=0.5692 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:BAB04077 GB:AP001508 short-chain fatty acids transporter [Bacillus halodurans]
Identities = 158/465 (33%), Positives = 248/465 (52%), Gaps = 41/465 (8%)

Query: 15 IKTKKRFMDRYIDGFMKWMPESLFICFILTFLWTMSVLMTDSPFIGTEKTGGIIYGWVN 74
I R M RY+ P+ +LTFLV +S++ T+S T T 1+ W
Sbjct: 5 ISLSNRLMQRYL PDPFLFWLLTFLVFALSLIFTES TPLT- - IVQYWGE 51

Query: 75 GFWGLLSFMQMTILLATGmVASSPPJiHKMFKSLAKLPQTRTQIFIFSIVVGSIPGFLH 134
GFWGLLSF+MQM ++L TG+ +ASSP K +LA LP + Q + W + F++
Sbjct: 52 GFWGLLSFSMQMVLVLVTGHVLASSPLFKKGLGALAGLPASPGQAILLVTWSLVASFIN 111

Query: 135 WGIX31#WAIVFGKELLVQARQKGIKVHTPLFVATLFFTFLPATSGLSGftAVLYSATPDYL 194
WG G+++ +F KEL +K V L +A+ + F+ GLSG+ L ATPD+
Sbjct: 112 WGFGLVIGALFAKELA KKVENVDYRLLIASAYSGFMIWHGGLSGSVPLTIATPDHP 167

Query: 195 imSVanAYKQVVPESVPLTESVL---NLPFISLI,WCMLVPLCFALLaHPKDETKIME-- 249
+ +P +E++ NL + L + +PL L+ K +T ++
Sbjct: 168 AQDMIGV IPTSETIFAPYISn^IWALFIA- - IPLANRIJffllPGKSDTVTVDRS 217

Query: 250 -LDDEIYHHSLDTASHWIAR]Sr^PAEKM^IASRLVMYLVGGAIVSYSLYHFSVVGLS^^ 308
LDD L AS + + TP++++ SR++ LVG + + Y+F+ G B+L
Sbjct: 218 LI^DG---RDLQAAS-I£]:jEAMTPSDRLENSRMISLLyGVIiGIiVFLGYYFATNGFE-IJi& 272

Query: 309 NCFNFLFLGLGLLLCGQQGPEYYGSLFKDGVMSSWGLVLQFPFYAGIFGIIQSTGLGLEI 368
+ N LFL LG+L G P+ + V + G+++QFPFYAG+ GI+ S+GL +
Sbjct: 273 DlVNSLFLFLGILFHGT--PKLFLKAVTSAVKGASGIIIQFPFYAGLMGlMVSSGLA'rVM 330

Query: 369 SHFFVAISNGTTWPVFAYLYSALLNIAVPSGGSKFVIEAPYIVPATIEVGNDLGKILQAY 428
S PV+ SN T+P+F +L + ++N+ VPSGG ++ ++AP ++ A +G K A
Sbjct: 331 SEAFVSPSNEVTFPLFVFIiSAGIVNVFVPSGGGQtJAVQAPVVLEAfiQSLGVPlU^K^^ 390

Query: 429 QLGDATTNLIVPFWALSYLSNFKLKFNQIVAYTIPCVLWTGIAI 473
GDA TN+I PFWAL L+ LK 1+ + + +LW+G+ I
Sbjct: 391 AWGDAWTNMIQPFWALPALAIAGLKAKDIMGFCV-MILWSGWI 434

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2397 

A DNA sequence (GASx137R) was identified in S.pyogenes <SEQ ID 7293> which encodes the amino acid
sequence <SEQ ID 7294>. Analysis of this protein sequence reveals the following: 

Possible site: 58

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2591 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:



WO 02/34771 PCT/GB01/04789

>GP:AAC22434 GB:U32761 transcriptional regulator [Haemophilus influenzae Rd] 
Identities = 37/107 (34%) , Positives = 56/107 (5X%) , Gaps = 1/107 (0%) 

Query: 21 LHRQNLVTPDKTFMIOTiQLTTLFEERNSLPVVKCySASWDPljaiCTRYS-SYLTILPRPI 79 

LH+Q + FD+TFMI+H L PE N P + S+ WDFLL+ + + LTILP P+ 
Sbjct: 205 LHQQK^ffiIPDQTFMIHHHLKEAFERmCTPDIVLDSS<M3FLLSAVKraKELLT 264 

Query: 80 THFAHMDGLVEVQLTEHPKWEWLASLKHNKTSHLKHYIKHTILDYF 126 

H + ++ W+V L + +HL+ YI +L+ F 

Sbjct: 265 AELYHSKEFLCRKIESPVPWKOTLCRQRKTVYTHLEEYIFDKLLEAF 311 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2398 

A DNA sequence (GASx140) was identified in S.pyogenes <SEQ ID 7295> which encodes the amino acid
sequence <SEQ ID 7296>. Analysis of this protein sequence reveals the following: 

Possible site: 50

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3351 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>



No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

!GB:U32761 acetate CoA-transferase, alpha subunit [H... 215 4e-55
Identities = 105/213 (49%), Positives = 146/213 (68%)

Query: 22 ENKRIAIAEAISHIKDGDTIMVGGFMANGTPEALinALVDKGTKDLTLICNDaGPVDRGV 81
+ K + + +A +DG TIMVGGPM GTP L++AL++ G +DLTLI ND PVD G+
Sbjct: 2 KTKUmiQDATGFPRMMTIMVGGFMGIGTPSRLVEALLESGVRDLTLIANDTAFV^ 61

Query: 82 GKMVANHQFKTIYATHIGIiNKERGRQMTAGETTIELIPQGTFAEKIRIGAYGIGGFYTPT 141
G ++ N + + + A+HIG N E GR+M +GE + L+PQGT E+IR G G+GGF TPT
Sbjct: 62 GPLIVNGRVRKVIASHIGTNPETGRRMISGEMDWLVPQGTLIEQIRCGGAGLGGPLTPT 121

Query: 142 GVGTLVAEGKETKTIKGKTVLLEYPFEADVALIPANQADEMGNLQYSGSENNPNQLMAAC 201
GVGT+V EGK+T T+ GKT+LLE P AD+ALI A++ D +GNL Y S NPN L+A
Sbjct: 122 GVGTVVEEGKQTLTIJ3GKaWiLERPIiRADI>ALIRAHRCDTLGNLTYQLSARNPNP^ 181

Query: 202 AKTTIVQAREIVPVGTIQPECVHTPHIFVDYIV 234
A T+V+ E+V G +QP+ + TP +D+I+
Sbjct: 182 ADITLVEPDELVETGELQPDHIVTPGAVIDHII 214

subunit (EC 2.8.3.-). [Escherichia coli]





Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2399 

A DNA sequence (GASx141) was identified in S.pyogenes <SEQ ID 7297> which encodes the amino acid
sequence <SEQ ID 7298>. Analysis of this protein sequence reveals the following:

Possible site: 41

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.4941 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAF12248 GB:AE001862 CoA transferase, subunit B [Deinococcus radiodurans] 
Identities = 114/203 (56%), Positives = 158/203 (77%), Gaps = 3/203 (1%) 



Query: 11 QNRIAKRVAKELEDGTLVNLGIGLPTKVANFVPEEMTVYFQSENGFIGLGP--KSDDPNS 68 

++ +A R A+EL+DG VHMIGLPT VAN +P M+V+ QSENG +G+GP D+ + 
Sbjct: 5 RDEMaARAAQELCPGYYVNICIGLPTLVaNHIPAGMSVWLQSENGLLGIGPFPl^ 64 

Query: 69 TIVNaGGQPVTVYPGAAFENSflDSFGIIRGGHVDLTVLGSiLEIAENGDIfl^ 128 

++NaG Q VT PGA+FP+SflDSF +IRG6HV+L +I,Ga++++E GD+AN++IPGKMV 
Sbjct: 65 DLIiaGKQTVTALPGaSFFSSaDSFAMIRGGHVNLAILGaMQVSETGDIJ^^ 124 

Query: 129 GMGGAMDLLVGAKKVIVAMEHTNKG-KHKLLKECTLPLTAKGWDLIITEMGVFKVTPDG 187 

GMGGAMDL+ G ++V+V MEH KG HK+L+ECTLPLT +GWD IIT++GV VTP G 
Sbjct: 125 GMGGftMDLVAGVQRVVVIl^ffiHVRKBDftHKIIlRECTLPIlTGQGWDRIITDLGVLDVTPQG 184 

Query: 188 IQVIEISEGFTFDEVQAATGVPL 210 

++++E++ G T DE++ TG + 
Sbjct: 185 LKLVELAPGVTLDELRQKTGADI 207 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2400 

A DNA sequence (GASx144) was identified in S.pyogenes <SEQ ID 7299> which encodes the amino acid
sequence <SEQ ID 7300>. Analysis of this protein sequence reveals the following: 

Possible site: 39

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3227 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>



No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database:

>GP:BAA29948 GB:AP000003 137aa long hypothetical protein [Pyrococcus horikoshii]
Identities = 49/113 (43%), Positives = 71/113 (62%), Gaps = 1/113 (0%)

Query: 5 PEPMGPySTYTIEGHFLYTAGQLPLNPVTGQLSDG-FEAQCRQVFVNLQSILAEQKLDLN 63 

P+P+GPYS G+FL+ AGQ+P++P TG++ G + Q RQV N+++IL LN 

Sbjct: 22 PKPIGPySQAIKaG^ffLFIAGQIPIDPIOTGEIVKGDIKDQTRQVLENIKaILEAaGYSLN 81 

Query: 64 HIYKIJmLTDVTNVEIIJi^HV^m)LFEEPYPVRTAVQVSaLPLQALIE^ 116 

+ K+ VYL D+ + +N V + F E PR AV+VS LP LIE+EA+A 
Sbjct: 82 DVIKVTVYLKDMfflDFAKMNEVYAEYFGESKPARVAVEVSRLPKDVLIEIEAIA 134 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2401

A DNA sequence (GASx146) was identified in S.pyogenes <SEQ ID 7301> which encodes the amino acid
sequence <SEQ ID 7302>. Analysis of this protein sequence reveals the following:

Possible site: 16

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1238 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2402 

A DNA sequence (GASx147) was identified in S.pyogenes <SEQ ID 7303> which encodes the amino acid
sequence <SEQ ID 7304>. Analysis of this protein sequence reveals the following: 

Possible site: 30

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial membrane --- Certainty=0.5585 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database:

>GP:BAA04270 GB:D17462 Na+-ATPase subunit I [Enterococcus hirae]
Identities = 232/681 (34%), Positives = 370/681 (54%), Gaps = 40/681 (5%)

Query: 1 MAISQMKKLAMVFEKDYLDLVLKTLQQSQLVEVRDMKQLKH WQDAETJKGNVKLPQIV 57
MA+++M+K+ ++ +K +++L+ +Q VE+RD+ Q W + F P+++
Sbjct: 1 MAVTKMEKVTLISDKKNREILLQAVQGLHAVEIRDLFQESENNQWVETF FPEPEMI 56

Query: 58 QYDLTHQKPLLDDEALQYLLQSQQELENGLASLSAFLPPIGKLTALRQ--KTPSLSFRQF 115
D K LYL ++F+G+++QKLS
Sbjct: 57 DKDKELAK LSYKLTD IRTAIQFIEHHGEKSQKKQHLKRRELSLDTL 102

Query: 116 EERHRQQAAQTALKMMSQKIERLEQLQSKIDQLTEYCQELEKWRSLTVLPQDLAQEHPLS 175
E+ + ++A L+ + E+ EQL + QL + L W++L + P+
Sbjct: 103 EKNYSEEAFSKKLEEVLLLKEQWEQLVDERQQLEDQENWLUWQNLDLAPKAFDS-QMTK 161

Query: 176 ARVGTIPSTANNHFYHQLKQHKGLFIEEVYH TEFEYGLVLFWQAQDTIHLQKYQFK 231
+GT+ + F ++ + ++EE+ T F Y ++ +++ +Y F
Sbjct: 162 LVIGTVNAKNAESFKAEVAEINEAYLEEINSSPTTTYFAYIVLRADESRMEEITiSRYGFV 221






iQKTSKRLVTPFNILa.ISVAIWGLIYGSFFG- FDLPVRLLSTETDVITIL 460
L + +R FP ILAI IWG lY SFPG LP +LST DV TIL

Query: 232
Sbjct: 222
Query: 292
Sbjct: 282
Query: 350
Sbjct: 342
Query: 410
Sbjct: 402
Query: 461
Sbjct: 462
Query: 521
Sbjct: 522
Query: 580
Sbjct: 582
Query: 640
Sbjct: 642

Y + P +QL K+ L ++ L + + + + L++ +R+
K +++ T +LI ++GW++ + +L ++ L ++D D+ E+VP KL+
NH +APFE++TEMY+LPKY+E DPTP++ P YL FEGMMVAD+GYGLL+H-
++S++FG + ++ GL + A + ++ KAY A AW +ILLG++L +LG
+G LA+ +A IL++ + +S S G+ G YNLYG++ Y+ DLVS+TRLMAIG+SG
SI AAFNM+V PP RF+VGI + I+L A+N+FL++LS YVHGARL +VEFFGKFy
GGG++F PLK + YVN+N +



Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2403 

A DNA sequence (GASx148) was identified in S.pyogenes <SEQ ID 7305> which encodes the amino acid
sequence <SEQ ID 7306>. Analysis of this protein sequence reveals the following: 

Possible site: 40 

40 

>>> Seems to have no N- terminal signal sequence 

IlSrrEGRAL Likelihood = -7.80 Transmembrane 28 - 44 ( 21 - 51) 

INTEGRAL Likelihood = -6.85 Transmembrane 148 - 164 ( 146 - 170) 

INTEGRAL Likelihood = -2.81 Transmembrane 105 - 121 ( 105 - 123) 

45 



Final Results 

bacterial membrane Certainty^O .4121(Affirmative) < suco 

bacterial outside Certainty=0. 0000 (Not Clear) < suco 

bacterial cytoplasm Certainty= 0.0000 (Not Clear) < suco 
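The "Final Results" lines follow a fixed pattern (compartment, certainty score, verdict), so the predicted localization can be read off mechanically. The following is a minimal Python sketch and not part of the original analysis; the names `parse_final_results` and `best_localization` and the regular expression are illustrative assumptions, written to tolerate some of the spacing noise found in scanned copies.

```python
import re

# Hypothetical helper (not from the original analysis): extracts
# (compartment, certainty, verdict) from PSORT-style "Final Results" lines.
_RESULT = re.compile(
    r"bacterial\s+(\w+)\s*[-\s]*Certainty\s*=\s*([0-9.]+)\s*\(([^)]+)\)"
)

def parse_final_results(text):
    """Return one (compartment, certainty, verdict) tuple per matching line."""
    rows = []
    for line in text.splitlines():
        m = _RESULT.search(line)
        if m:
            rows.append((m.group(1), float(m.group(2)), m.group(3).strip()))
    return rows

def best_localization(text):
    """Return the row with the highest certainty, or None if nothing parsed."""
    rows = parse_final_results(text)
    return max(rows, key=lambda r: r[1]) if rows else None

sample = """\
bacterial membrane --- Certainty=0.4121 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>
"""
print(best_localization(sample))
```

Lines whose digits were split by the scan (for example `0. 4510`) would need to be cleaned before this sketch can read them.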

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:BAA03841 GB:D16334 Na+-ATPase K subunit [Enterococcus hirae] 
Identities = 85/150 (56%) , Positives = 107/150 (70%) 

Query: 20 HYFTAHGGVFFAALGIVLAVALSGMGSAYGVGKGGQAAAALLKEEPEKFTSALILQLLPG 79 

+ T +GG+ FA L + A SG+GSA GVG G+AAAAL +PEKF ALILQLLPG 
Sbjct: 4 YLITQNGGMVFAVLAMATATIFSGIGSAKGVGMTGEAAAALTTSQPEKFGQALILQLLPG 63 



Query: 80 SQGIYGFAIGILIWMKLTPELSVNQGLAYFLVSLPIAIVGYFSAKHQGNVSVAGMQILAK 139

+QG+YGF I LI++ L ++SV QGL + SLPIA G FS QG V+ AG+QILAK 
Sbjct: 64 TQeaYGFViaFLIFlNIfiSDMSWQGUmjGASLPIAFTGLFSGiaQGro^^ 123

Query: 140 RPKDFMKGVILftaMVETYAILAFWSFIIiL 169 

+P+ KG+I AAMVETYAIL PV+SP+IiH- 
Sbjct: 124 KPEHATKGIIFAAMVETYAILGFVISFLLV 153 
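Each alignment row carries its own checksum: in a row of the form `Query: START SEQUENCE END`, the number of non-gap residues should equal END - START + 1. For the first Query row above, 79 - 20 + 1 = 60, which matches the 60 letters printed. The Python sketch below applies that check; the function name `row_is_consistent` and the regular expression are assumptions introduced for illustration, not part of the original analysis.

```python
import re

# Hypothetical check (an assumption, not part of the original analysis):
# in a BLAST-style row "Query: START SEQUENCE END", the number of non-gap
# residues should equal END - START + 1.
_ROW = re.compile(r"^(Query|Sbjct):\s+(\d+)\s+(\S+)\s+(\d+)\s*$")

def row_is_consistent(line):
    """Return True/False for a parsable row, None if the row cannot be parsed."""
    m = _ROW.match(line.strip())
    if m is None:
        return None  # e.g. sequence broken by spaces, or a match line
    start, seq, end = int(m.group(2)), m.group(3), int(m.group(4))
    residues = len(seq.replace("-", ""))  # gap characters are not residues
    return residues == end - start + 1

row = "Query: 20 HYFTAHGGVFFAALGIVLAVALSGMGSAYGVGKGGQAAAALLKEEPEKFTSALILQLLPG 79"
print(row_is_consistent(row))
```

A row that fails this check has probably lost or gained characters in transcription, which makes the test a cheap way to flag damaged rows before using the alignment.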

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2404 

A DNA sequence (GASx149) was identified in S.pyogenes <SEQ ID 7307> which encodes the amino acid
sequence <SEQ ID 7308>. Analysis of this protein sequence reveals the following:

Possible site: 55
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.4510 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:BAA04272 GB:D17462 Na+-ATPase subunit E [Enterococcus hirae]
Identities = 43/193 (22%), Positives = 95/193 (48%), Gaps = 2/193 (1%)

Query: 1 VNDITQLRQNVLEKaHQEGQQCLKIATDSLDTDFKERQQQGLHDLKAKRQKEIiKALEQQF 60 

V+ I ++ + E A E ++ +D F+ ++ Q D + ++ +L+ +E+ + 

Sbjct: 3 VDAIDKIITQINETAQLERASFEEMKRKEIDQKFEVKKWQIEADFQKEKASKLEEIERSY 62 

Query: 61 QVAQQQLKNQERQAMALKQDSIKELFEASLEKMTNFSKEEELAFIiKQVLSKyP-EQPLQ 119 

+ + + K Q +Q +L , KQ+ ++ LF + ++ N KEE+LA +KQ++ P + 
Sbjct: 63 RQLRNKQKMQVRQEILNAKQEVLQRLFTEATLQLENEPKEEQIJSIiMKQMIQTLPINGTAR 122 

Query: 120 ^7TFGEKTGQKFSSYDCAEIlRLAFPQLS■raQELIPQ-EAGFLVSIJDQVDDNYLYRYLLESV 178 

+ GEK+ + AE P ++ + +AG ++ + N+L+ +L++ + 

Sbjct: 123 LIPGEKSADILTPAVIAEWNEELPFELIREDFTEKAQAGLIIDDaGIQYNFLFSHLIKEI 182 

Query: 179 LKEESSRIIDMLF 191 

+ S+ I LF 
Sbjct: 183 QETMSAEIAKELF 195 
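The summary line that precedes each alignment gives raw counts alongside the percentages, so the counts can be extracted and compared across examples. The Python sketch below parses the summary line of this example; `parse_summary` and its regular expression are hypothetical names introduced for illustration, not part of the original analysis. For this line the reported identity percentage agrees with the integer part of 43/193 (about 22.3%).

```python
import re

# Hypothetical parser (illustrative assumption): reads the counts from a
# BLAST summary line such as "Identities = 43/193 (22%), Positives = ...".
_STAT = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")

def parse_summary(line):
    """Map each statistic name to (count, alignment length, reported percent)."""
    return {
        name: (int(num), int(den), int(pct))
        for name, num, den, pct in _STAT.findall(line)
    }

stats = parse_summary(
    "Identities = 43/193 (22%) , Positives = 95/193 (48%) , Gaps = 2/193 (1%)"
)
matches, length, reported = stats["Identities"]
# Integer division reproduces the truncated identity percentage for this line.
print(matches, length, reported, 100 * matches // length)
```

Summary lines without a Gaps field (ungapped alignments) simply yield a dictionary without that key.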

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2405 

A DNA sequence (GASx150) was identified in S.pyogenes <SEQ ID 7309> which encodes the amino acid
sequence <SEQ ID 7310>. Analysis of this protein sequence reveals the following:

Possible site: 30
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.3095 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:BAA04273 GB:D17462 Na+-ATPase subunit C [Enterococcus hirae]
Identities = 94/326 (28%), Positives = 167/326 (50%), Gaps = 5/326 (1%)

Query: 6 ElOTTISVKEKELLTKEQFDKlLQAPOT'TTIJiJlLLHQSTOHLTVDDLNDLDRLESILMAE 65 

ELN I +E EL++K+ F++++Q + +L +L ++Y + D D D E+ L E 
Sbjct: 5 EENPLIRGRELEIiISKDTFEQMIQTDSIDSLGEILQSTIYQPYIYDGFDKD-FEANLSQE 63 

Query: 66 LTKTYRWAPAETPQPDIVQLFTLRYTYHNVKVLIjKaKASQftDLSHLLLPIGDKPLVMiEH 125

+K ++W P+P+IV ++T+RVT+HN+KVL Kft.+ + +L HL + G L L+ 
Sbjct: 64 RSKLPQWLKESAPEPEIWIYTMRyTFHOTiKVLTKaEITGQNLDHLYIHDGFYSLET^ 123 

Query: 126 LIRTMTSDEFPKEWTElQSIWAEYQDYQDIRVLEIGTDIAYFKALKQIAQRIiEDPVFQQ 185
I T S E P ++ 1+ + ++ ++ +++ D + +++ ++L P +

Sbjct: 124 AIHTQVSVELPDSLMDYIREVHEYCEESTILQGIDVIYDRCFLTEQRRLGEQLGYPELLE 183 

Query: 186 AVLIVIDLYNLITVRRAKSQNKPISF^MQLrJSDEaSRPSKTFITLEDDKDIMTWPENVTP 245 

++ IDL N+ T R Q++ FM ++S S P T ++ ++++ + + 

Sbjct: 184 EIIAFIDLTNITTTARGILQHRSAGFMTTVISSSGSIPKDTLLSFVRG-EMASFTQFLLT 242

Query: 246 DSYMTALKPYSEKLRQGTLQTTELEYLVDECIjYHLFAKAKYQVDGPYVLARFLIiAKSFEV 305 

Y LK + + ++ LELD+L +A+QGPLFLAKE 
Sbjct: 243 TDYSELLK---QVIHEEQIDLVSLEQLKDDYLSSFYQVaQTQAFGPLPIiLAFLNaKEVES 299 


Query: 306 KNLRLLftaALaNDLPKERVIERMRPI 331 

KNLRLIi N E++ ERMR ■)- 
Sbjct: 300 KNLRLLIIGKRNHFSLEQIiKERMRQV 325 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2406 

A DNA sequence (GASx151) was identified in S.pyogenes <SEQ ID 7311> which encodes the amino acid
sequence <SEQ ID 7312>. Analysis of this protein sequence reveals the following:

Possible site: 29
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.0484 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:BAA04274 GB:D17462 Na+-ATPase subunit G [Enterococcus hirae]
Identities = 45/101 (44%), Positives = 65/101 (63%)

Query: 6 YKVGVIGNRDVILPFQMIGFQTFFVIKPQDAINQLRQIiAMEDFGIIYITEDIAAAIPEAL 65 
YK+GV+G++D + PF++ GF + + ++A ++G+IYITE A +PE +

Sbjct: 3 YKIGVVGDKDSVSPFRLFGFIWQHGTTKTEIRRriDEMAKNEYGVIYITEQCaNLVPETI 62 

Query: 66 THYDNQVLPAVIPLPTHQGAQGIGLSRIQAMVEKAVGQNIL 106 

Y Q+ PA+I +P+HQG GIGL IQ VEKAVGQNIL 

Sbjct: 63 ERYKGQLTPAIILIPSHQGTLGIGLEEIQNSVEKAVGQNIL 103

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2407 

A DNA sequence (GASx152R) was identified in S.pyogenes <SEQ ID 7313> which encodes the amino acid
sequence <SEQ ID 7314>. Analysis of this protein sequence reveals the following:

Possible site: 21
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.1048 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2408 

A DNA sequence (GASx156) was identified in S.pyogenes <SEQ ID 7315> which encodes the amino acid
sequence <SEQ ID 7316>:

EYSI IPQLKETIHYIELKIiEEAERASLVRIMKITS 

Analysis of this protein sequence reveals the following: 

Possible site: 16
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.5026 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:BAA04277 GB:D17462 Na+-ATPase subunit D [Enterococcus hirae]
Identities = 119/201 (59%), Positives = 151/201 (74%), Gaps = 2/201 (0%)

Query: 10 RIiNVKPTRMELSIOiKNRLKTATRGHKLLKDKRDEMaiFTO 69 
RLNV PTRMEL+ LK +L TATRGHKLLKDK+DELMR+F+ LIR+HNELRQ lEKE

Sbjct: 2 RIJimiPTRMELTRLKKQLTTATRlGHKLLKDRQDEIjyiRQPIIiLIRkmEr^ 61 

Query: 70 MKEFVLAKASENSLMVEELFAVPVHEOTLWIDIENIMSVNVPKFHVQSNTAREQEQGEFA 129
MK+FVLAK++ ++EL A+P V++ + +NIMSV VP + Q + + E 

Sbjct: 62 MKDFTOJ^TVEEAFIDELLALPAENVSISVVEKNIMSVKVPMJFQYDETIiN^ 119

Query: 130 YSYLSSNSEWm'lQKTKELimLLRIJffiVEKTCQIjMADDIEKTRR^ 189 

Y YL SN+E+D +1 +Ii KIiL+LREVEKrCQrjyiA++IEKTRRRVN LEY IPQI1+ 
Sbjct: 120 YGYLHSNaEIJ3RSIIX;FTQLLPKI.LKI^VEKTCQLMftEEIEKTRRRVNALEYMTIPQLE 179 






Query: 190 ETIHYIELKIiEERERJiSLVRI 210 
ETI+YI++KLEE ERA + R+ 
Sbjct: 180 ETIYYIKMKLEENERAEVTRL 200

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2409

A DNA sequence (GASx161R) was identified in S.pyogenes <SEQ ID 7317> which encodes the amino acid
sequence <SEQ ID 7318>. Analysis of this protein sequence reveals the following:

Possible site: 27
>>> Seems to have an uncleavable N-term signal seq

Final Results
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2410 

A DNA sequence (GASx164) was identified in S.pyogenes <SEQ ID 7319> which encodes the amino acid
sequence <SEQ ID 7320>. Analysis of this protein sequence reveals the following:

Possible site: 36
>>> Seems to have no N-terminal signal sequence
INTEGRAL Likelihood = -1.06 Transmembrane 9 - 25 ( 9 - 25)

Final Results
bacterial membrane --- Certainty=0.1426 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

A related sequence was also identified <SEQ ID 9091> which encodes the amino acid sequence <SEQ ID
9092>. Analysis of this protein sequence reveals the following:

Possible cleavage site: 33
>>> Seems to have a cleavable N-term signal seq.

Final Results
bacterial outside --- Certainty=0.300 (Affirmative) < succ>
bacterial membrane --- Certainty=0.000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2411

A DNA sequence (GASx165) was identified in S.pyogenes <SEQ ID 7321> which encodes the amino acid
sequence <SEQ ID 7322>. Analysis of this protein sequence reveals the following:

Possible site: 59
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2251 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2412 

A DNA sequence (GASx166) was identified in S.pyogenes <SEQ ID 7323> which encodes the amino acid
sequence <SEQ ID 7324>. Analysis of this protein sequence reveals the following:

Possible site: 34
>>> Seems to have a cleavable N-term signal seq.

Final Results
bacterial outside --- Certainty=0.3000 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2413 

A DNA sequence (GASx167) was identified in S.pyogenes <SEQ ID 7325> which encodes the amino acid
sequence <SEQ ID 7326>. Analysis of this protein sequence reveals the following:

Possible site: 31
>>> Seems to have a cleavable N-term signal seq.

Final Results
bacterial outside --- Certainty=0.3000 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
Example 2414

A DNA sequence (GASx168R) was identified in S.pyogenes <SEQ ID 7327> which encodes the amino acid
sequence <SEQ ID 7328>. Analysis of this protein sequence reveals the following:

Possible site: 22
>>> Seems to have a cleavable N-term signal seq.

Final Results
bacterial outside --- Certainty=0.3000 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2415 

A DNA sequence (GASx169R) was identified in S.pyogenes <SEQ ID 7329> which encodes the amino acid
sequence <SEQ ID 7330>. Analysis of this protein sequence reveals the following:

Possible site: 31
>>> Seems to have a cleavable N-term signal seq.

Final Results
bacterial outside --- Certainty=0.3000 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2416 

A DNA sequence (GASx170) was identified in S.pyogenes <SEQ ID 7331> which encodes the amino acid
sequence <SEQ ID 7332>. Analysis of this protein sequence reveals the following:

Possible site: 61
>>> Seems to have no N-terminal signal sequence
INTEGRAL Likelihood = -2.34 Transmembrane 154 - 170 ( 153 - 170)
INTEGRAL Likelihood = -1.12 Transmembrane 20 - 36 ( 19 - 36)
INTEGRAL Likelihood = -0.69 Transmembrane 52 - 68 ( 52 - 68)
INTEGRAL Likelihood = -0.53 Transmembrane 399 - 415 ( 399 - 415)

Final Results
bacterial membrane --- Certainty=0.1935 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:BAB05347 GB:AP001512 cystathionine beta-lyase [Bacillus halodurans] 
Identities = 200/384 (52%), Positives = 262/384 (68%), Gaps = 3/384 (0%) 

Query: 79 lAEVYEMRENTTLLHGYTVIDEFTGftASVPIYQTSTFHNSELYCPSQKHLYTRFSNPTTE 138

++E Y ++ T I)LH +D+ TGA SVPI STFH + + ,+ Y+R NPT + 
Sbjct: 1 MSEQYSLQ--TKIjLH15EHKVDQATGAVSVP1QHASTFHQFD-PDTPGTYDYSRSGNPTRD 57 

Query: 139 ALEDGLACLEKATYAVAYASGMAAISTVLMLLKAGDHVIFPLEVYGGTCQFATAILPNYQ 198 
ALE +A LE + A+ASGMAAIST MLL GDHV+ +VYGGT + T +L

Sbjct: 58 ALEAAIAELEGGNHGFAFASGMAAISTAFMLLSKGDHWLTKDVYGGTFRLVTEVLTRLG 117 

Query: 199 lETSFVDMADLATVKRSIRPNTRMIYLETPSNPLLKICDISELVQLAKAYGVLTVA^ 258 
IE +FVDM +LA V A+IRPNTR++Y+ETPSNP L I DI +V LAK + LT DNTF 
Sbjct: 118 IEHTFVDMT^^^aEVaAAIRP]SlTRVLYMETPSNPTLNITDIR6WSLAKEHECLTFLDOT 177

Query: 259 MTSLYQEPLAMGVDIVVESVTKFINGHSDWAGIJUVTNNEAIYNQLKLFQKNFGAIVGVE 318 

+T Q PL +GVD+V+ S TKFI GHSDWAGLA T NE + +L Q +FGAI+GV+ 
Sbjct: 178 LTPALQRPLELGVDVVIiHSATKFIGGHSDWAGIiAVTKIffiELGKKIiAFLQNSFGAILGVQ 237 


Query: 319 raWLILRGMraMGIRMEQAVKNAQQLRNYLAKHPKVLKVHYPGL^^ 378 

D WL+LRG+KT+ +RME K AQQ+A +L P+V +V+YPGL HP H+ +QA+ 
Sbjct: 238 DWLVLRGLKTLHVRMEHGEKGAQQIAEWLQGVPEVKRVYYPGLKDHPGHELQKRQAEGP 297

Query: 379 GAVLSFELASKEELMTFTHRIQLPILAVSLGGVESILSHPATMSHACLSPQMLEQGWD 438

GAVLSFEL ++E + F ++LP+ AVSLG VESILS+PA MSHA + + R +G+ D 
Sbjct: 298 GAVLSFELENEEAVRRFTOHVKLPVFAVSLGAVESILSYPAKMSHAAMPKEEREARGIRD 357 

Query: 439 GLLRLSCGVENIEDLLADFEQALA 462 
GLLRLS G+E E+L+ADF+ A A

Sbjct: 358 GLLRIiSVGLEKPEEUflADFKRAFA 381 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2417

A DNA sequence (GASxlVS) was identified in S.pyogenes <SEQ ID 7333> which encodes the amino acid 
sequence <SEQ ID 7334>. Analysis of this protein sequence reveals the following: 

Possible site: 21
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.1492 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2418 

A DNA sequence (GASx182) was identified in S.pyogenes <SEQ ID 7335> which encodes the amino acid
sequence <SEQ ID 7336>. Analysis of this protein sequence reveals the following:

Possible site: 22
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2584 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2419 

A DNA sequence (GASx187) was identified in S.pyogenes <SEQ ID 7337> which encodes the amino acid
sequence <SEQ ID 7338>. Analysis of this protein sequence reveals the following:

Possible site: 61
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2084 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2420 

A DNA sequence (GASx188) was identified in S.pyogenes <SEQ ID 7339> which encodes the amino acid
sequence <SEQ ID 7340>. Analysis of this protein sequence reveals the following:

Possible site: 34
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.2060 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAG05515 GB:AE004640 conserved hypothetical protein [Pseudomonas aeruginosa]
Identities = 140/442 (31%), Positives = 208/442 (46%), Gaps = 73/442 (16%)

Query: 2 KKYLNQNVYDALIERIiHFLFNDFPrVYISFSGGKDSGLLIiNIIiLDFRDKYYPDREIG 58

K Y + +V+ A + RL +F +F V ++FSGGKDS + L + LD RE+G

Sbjct: 4 KHYQDADVHAATLSRLRLVFRNFERVCVAFSGGKDSSVTLQLALDVA RELGRSP 57 

Query: 59 --VFHQDFEAQYSLTTKYVQETFTSLEGRKKVSLYWVCLPMATRTALSSYEMFWYPWDDK 116
V D E QY T +V E GR V +WVCLP+ R A S E +W W+ 

Sbjct: 58 VDVLPIDLEGQYQATIDHVSEML GRPDVRPVJWVObPLNLRNASSLEEFYWCCWEPG 113

Query: 117 TEDIWVRPMPSQDYVINLENNSITTYRYKmQEDLRKQFGRWYKQIHGNQKTVCILGNRA 176 
E WVRP+P Q VI+ + YRY+M E+ F W + + T ++G R+
Sbjct: 114 AEADWVRPLPKQRGVIS -DPAFFPFYRYRMEFEEFVAGENAWLRR- - -EEPTAFLVGIRS 169

Query: 177 SESLHRYSGFINKKYGYQKEC WITKQFKDVWTAS- -PLYDWSVEDIWH 222
ESL+RY K+ K+C W + + S P+YDW ED+W
Sbjct: 170 DESLiNRYLAV--KraSRAKQCaWTPPGGSAPLAWSftRDRftNPQAVSFFPIYDWRFEDLWR 227

Query: 223 AYYKFSYSYNELYDLFSnCAGLKPSQMRVASPFQDYAVDSLNLYRIIDQETWVKIiLGRVQG 282
Y+YN LYD Y+AG+ SQMR+ P+ D L+L+ 1+ TW K++ RV G
Sbjct: 228 CVADHGYAYNRLYDQMYRAGVPFSQMRICQPYGDDQRKGLDLFHRIEPRTWFKVVRRVAG 287

Query: 283 VNFSNIYGRTKAMGYK-SIALPKGH-SWKSYTQFLLSTLPVRLRNNYVRKFNKSIDFWHK 340
N+ Y R + +GY+ + LP +W+ Y+QPLL ++P LR Y R+ + I +W +
Sbjct: 288 ANYGARYCRQRFLGYRGGLGLPPSFGTWREYSQFLLRSMPPPLRGIYQRRIERFILWWKQ 347

Query: 341 TGGGLAEETINELIEKBYRIARiraiSNYTSFKHSRVIFIMJ-IPDDTDDIVTTKDIPSWK 399
LA 1+ D IP + + PSW+
Sbjct: 348 HDYPLA IWPDAGIP ALENRREQPSWR 373

Query: 400 RMCFCILKNDHICRTMGFGLTR 421
R+ +LK D + R++ FG ++
Sbjct: 374 RIALSLLKQD-MARSLSFGFSQ 394





Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2421 

A DNA sequence (GASx189) was identified in S.pyogenes <SEQ ID 7341> which encodes the amino acid
sequence <SEQ ID 7342>. Analysis of this protein sequence reveals the following:

Possible site: 45
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.4121 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 



The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAC73702 GB:AE000165 orf, hypothetical protein [Escherichia coli]
Identities = 79/162 (48%), Positives = 110/162 (67%), Gaps = 1/162 (0%)

Query: 7 PVYEIKSIPIEKISPNDYNENSVAPPEMiCLLYDSIKSDGYTMPIVCYYDKEEDRYSIVDG 66 

PV + + ++ PNDYNPN+VAPPE KLL SI+ DG+T PIV + +++ IVDG 
Sbjct: 46 PVDCVLWVKNSQLMPM)YNPN^reaPPEKKLIIQKSIEIIX3FTQPIVVTHT-DKNftM^ 104 

Query: 67 FHRYRIMLDYSDIYERESGRLPVSVIDKSLDYRM2^TIRHNRARGSHDVDLMSQIVKDLH 126 

FHR+ I S + R G LPV+ ++ + + R+A+TIRHNRARG H + MS+IV++L 
Sbjct: 105 PHRHEIGKGSSSLKLRLRGYLPVTCLEGTRNQRIAATIRHNRARGRHQITAMSEIVRELS 164 

Query: 127 ECGRSDNWIAKHLGMDKDEILRLKQITGLASLPKDHEENQSW 168 

+ G DN I K LGMD DE+LRLKQI GL LP D +++++W 
Sbjct: 165 QLGWDDNKIGKELGMDSDEVLRLKQINGLQELEADRQYSRAW 206 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2422

A repeated DNA sequence (GASx192R) was identified in S.pyogenes <SEQ ID 7343> which encodes the
amino acid sequence <SEQ ID 7344>. Analysis of this protein sequence reveals the following:

Possible site: 13
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.4301 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAA63509 GB:X92946 transposase [Lactococcus lactis]
Identities = 23/36 (63%), Positives = 28/36 (76%)

Query: 1 MQDKLVTEAFNQAYNREKPKEGVIVHTDQGSQYTGa 36 
MQDKLV + F QA +E P+ G+IVHTDQGSQYT + 
Sbjct: 134 MC2DKLVRDCFLQaa3KEHPQPGLIVHTDQGSQYTSS 169

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2423 

A DNA sequence (GASx194R) was identified in S.pyogenes <SEQ ID 7345> which encodes the amino acid
sequence <SEQ ID 7346>. Analysis of this protein sequence reveals the following:

Possible site: 26
>>> Seems to have an uncleavable N-term signal seq

Final Results
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:CAA63508 GB:X92946 hypothetical protein [Lactococcus lactis]
Identities = 64/96 (66%), Positives = 78/96 (80%)

Query: 1 MPRKTFDKAFKLSAVKLILEEEQSVKMVSSTLEIHENSLYQWIQEYEKYGESAFPGHGSA 60 

M R+ FDK FK SAVKLILEE SVK VS LE+H NSIiY+W+QE E+YGESAFPG+G+A 
Sbjct: 1 MARRKFDKQFKNSAVKLILBEGySVKEVSQELEVHaNSLYRWVQEVEEYGESAFPGNGTA 60 

Query: 61 LRHAQFETKKLEKEHKLI^EErALLKKFQVFLKENR 96

L +AQ + K LEKE++ LQEEL LLKKF+VFLK ++ 
Sbjct: 61 LANAQHKIKLLEKENRYLQEELELLKKFRVFLKRSK 96 



Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2424

A DNA sequence (GASx195R) was identified in S.pyogenes <SEQ ID 7347> which encodes the amino acid
sequence <SEQ ID 7348>. Analysis of this protein sequence reveals the following:

Possible site: 13
>>> Seems to have a cleavable N-term signal seq.

Final Results
bacterial membrane --- Certainty=0.5522 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:CAB75191 GB:AL139075 putative integral membrane protein [Campylobacter jejuni]
Identities = 177/430 (41%), Positives = 274/430 (63%), Gaps = 8/430 (1%)



Query: 5 IIISAIALAIGIGYRTKINIGLLAIAFSYLIATTLMGLSPKELLHFWPTSLFFTIPSVSL 64
+IIS+I +A1 +GY T+ N+G+ A+ F+Y+I M L+PK+++ FWP S+FF IF+VSL
Sbjct: 6 LIISSIIVAIILGYITRHNVGIFAMIFAYIIGAFFMDLAPKKIIAFWPISIFFVIFAVSL 65

Query: 65 FYNVATTNGTLDVLAQHILYRTRTHPNALYMILYLIATLLSALGAGFFTTMAVCCPLAIT 124
FYN AT NGTL+ LA H++YR HP L ++++++ +++AI1GAGF+T +A PL
Sbjct: 66 FSlJFATVNGTLEKLAGHLMYRFANHPYLLPFVIFWSAIIAALQaGFYTVLAFMAPLTFL 125

Query: 125 LCQKADKHPLIGAQAVNWGASGGANLITSGSGIVFQGLFKQMGWE-EQAFSLGNHIFIVS 183
LC K + GA A+N+GA GGAN ITS SGI+F+GL + G E +AF+ + IF +
Sbjct: 126 La3KIGLSKIAGaMAINYGALGGANFITSQSGIIFR6LMENSGIEANEAFANSSIIFAFT 185

Query: 184 IIYPLIVLLLLSCYIRYSRGRTNSSLT-IDQPPVLSKVQRQTTLLMISSMVLVWLFPLLL 242
II P++VL + ++ + N ++ I +P Q+ T +LM +V+V +FP+L
Sbjct: 186 IILPIWL SFFVFNAFKNNIKISVISKPDPFDYKQKTTLILMFMMIVWLIFPVLN 241

Query: 243 LIFPNIAWIATYRQTFDIGFVSIIWCLALRLKLGKQEAILAKVPWAIIIMLCGMSLLMS 302
+IFP+ 1+ + + DI ++++ V +AL LKL ++ ++A +PW -(-IM+CG+ +L+S
Sbjct: 242 IIFPHNETISYFNKKIDIAMIAMIFVAIALFLKLADEKQWALIPWGTLIMICGVGMLIS 301

Query: 303 LAVKSGLVTLIGHLITTTIPHFWLPLFFCVIAGVMSLFSSTLSWAPTLFPIIATISAQS 362
+AV++G + L L-l- . I ++PL C lA MSLFSSTL W P LFPI+ -l-I+A S
Sbjct: 302 lAVEAGAIKLFSDLVENEINVIFIPLIMCAIAAFMSLFSSTLGWTPALFPIVPSIAASS 361

Query: 363 PHIDIRLLTTATIIGALSTNISPFSSAGSLIQLSLPHIEERSLAFKKQILLGVPISLSLA 422
+ LL + ++GA ++ ISPFSS GSLI S P + L FK ++ VPI A
Sbjct: 362 -GLSEALLFSCIWGAQASAISPFSSGGSLILGSCPDKyKEKL-FKDLLIKAVPIGFIAA 419

Query: 423 LLTIWILMLL 432
+L 1+ +
Sbjct: 420 ILATIIMSFl 429





Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
Example 2425

A DNA sequence (GASx196) was identified in S.pyogenes <SEQ ID 7349> which encodes the amino acid
sequence <SEQ ID 7350>. Analysis of this protein sequence reveals the following:

Possible site: 57
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.0563 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAC45128 GB:U65510 nicotinate-nucleotide pyrophosphorylase [Rhodospirillum rubrum]
Identities = 116/277 (41%), Positives = 170/277 (60%), Gaps = 4/277 (1%)

Query: 17 LTPFQIDDTLKaAIJlEDV-HSEDYSTI^IFDHHGQAKVSLFAKEAGVLAGLTVFQRVFTL 75 

L+PF ID+ ++ AL ED+ + D ++ A 4-A A++ G+LAGL + F L

Sbjct: 10 LSPFAIDEAVRRALAEDLGRAGDITSTATIPAATRAHARFVARQPGIIiAGLGCARSAFAL 69 

Query: 76 FDTEVTFQNPHQFKDGDRLTSGDLVLEIIGSVRSLLTCERV2iiaiFLQHLSGlASMTARXV 135 
D VTF P +DG + +G V E+ G+ R++L ER AUSTFL HI1SGIA+ T + 
Sbjct: 70 LDDT\n:"FTTP--IiEDGaEIAaGQTWiEVaGAARTILaAEaiTAIiNFIX^^ 127

Query: 136 EALGDDRIKVFDTRKTTPNLRLFEKYAVRVGGGYNHRFNLSDAIMLKDNHIAAVGSVQKA 195 

+A+ R ++ TRKTTP LR EKYAVR GGG NHRP L nA+++KDNHIA G V A 
Sbjct: 128 DAIAHTRARLTCTRKTTPGIiRGLEKyAWaBGGSNHRFGLDnAVLIKDlffi^ 187 


Query: 196 lAQftRAYAPFVKMVEVEVESL-AAAEEAftaftfiVDIIMLDimSLEQIEQaim 254 

+4-+ARA + +E+EV++L AE A G ++++LDNM + +A+ ++AGR E 
Sbjct: 188 IlSRaRflGVGHMVRIEIEVDTLEQLAEVIJWGGaEVVLLD^I^mPTLTRATO 247 

Query: 255 CSGNIDMTTISRFRGLAIDYVSSGSLTHSAKSLDFSM 291

SG + + TI+ +DY+S G+LTHS +LD + 

Sbjct: 248 ASGGVSLDTIAALAESGVDYISVGALTHSVTTLDIGL 284 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2426 

A DNA sequence (GASx199) was identified in S.pyogenes <SEQ ID 7351> which encodes the amino acid
sequence <SEQ ID 7352>. Analysis of this protein sequence reveals the following:

Possible site: 25
>>> Seems to have no N-terminal signal sequence

Final Results
bacterial cytoplasm --- Certainty=0.1649 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 
Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2427 

A DNA sequence (GASx201) was identified in S.pyogenes <SEQ ID 7353> which encodes the amino acid
sequence <SEQ ID 7354>. Analysis of this protein sequence reveals the following:

Possible site: 19
>>> Seems to have an uncleavable N-term signal seq

Final Results
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
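
As a hedged illustration of how the "Final Results" lines in these examples can be read programmatically, the following sketch parses result lines that have first been normalized to the form `bacterial <site> --- Certainty=<value> (<label>)` and reports the localization with the highest certainty. The names `RESULT_LINE`, `parse_final_results` and `predicted_localization` are hypothetical and are not part of the analysis program used to generate the listings.

```python
import re

# Hypothetical pattern for one normalized "Final Results" line.
RESULT_LINE = re.compile(
    r"bacterial\s+(?P<site>cytoplasm|membrane|outside)\s*-*\s*"
    r"Certainty\s*=\s*(?P<score>\d\.\d+)\s*\((?P<label>[^)]+)\)"
)


def parse_final_results(text):
    """Return {localization: (certainty, label)} for every result line found."""
    results = {}
    for match in RESULT_LINE.finditer(text):
        results[match.group("site")] = (
            float(match.group("score")),
            match.group("label"),
        )
    return results


def predicted_localization(results):
    """Return the site with the highest certainty, or None if none is above zero."""
    if not results:
        return None
    site, (score, _label) = max(results.items(), key=lambda item: item[1][0])
    return site if score > 0 else None


# Result lines of Example 2427, normalized by hand (all three are "Not Clear").
example_2427 = """
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>
"""

parsed_2427 = parse_final_results(example_2427)
print(predicted_localization(parsed_2427))
```

Returning `None` when every certainty is zero keeps "Not Clear" results distinct from an affirmative prediction; that is a design choice of this sketch only.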

Example 2428 

A DNA sequence (GASx203) was identified in S.pyogenes <SEQ ID 7355> which encodes the amino acid 
sequence <SEQ ID 7356>. Analysis of this protein sequence reveals the following: 

Possible site: 37 

>>> Seems to have a cleavable N-term signal seq. 

Final Results 

bacterial outside --- Certainty=0.3000 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2429 

A DNA sequence (GASx210) was identified in S.pyogenes <SEQ ID 7357> which encodes the amino acid 
sequence <SEQ ID 7358>. Analysis of this protein sequence reveals the following: 

Possible site: 29 

>>> Seems to have an uncleavable N-term signal seq 

Final Results 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2430 

A DNA sequence (GASx211) was identified in S.pyogenes <SEQ ID 7359> which encodes the amino acid 
sequence <SEQ ID 7360>. Analysis of this protein sequence reveals the following: 

Possible site: 24 

>>> Seems to have a cleavable N-term signal seq. 

Final Results 

bacterial outside --- Certainty=0.3000 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2431 

A DNA sequence (GASx213) was identified in S.pyogenes <SEQ ID 7361> which encodes the amino acid 
sequence <SEQ ID 7362>. Analysis of this protein sequence reveals the following: 

Possible site: 14 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4430 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2432 

A DNA sequence (GASx219) was identified in S.pyogenes <SEQ ID 7363> which encodes the amino acid 
sequence <SEQ ID 7364>. Analysis of this protein sequence reveals the following: 

Possible site: 15 

>>> Seems to have an uncleavable N-term signal seq 

Final Results 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2433 

A DNA sequence (GASx220) was identified in S.pyogenes <SEQ ID 7365> which encodes the amino acid 
sequence <SEQ ID 7366>. Analysis of this protein sequence reveals the following: 

Possible site: 24 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.0530 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2434 

A DNA sequence (GASx231R) was identified in S.pyogenes <SEQ ID 7367> which encodes the amino acid 
sequence <SEQ ID 7368>. Analysis of this protein sequence reveals the following: 

Possible site: 30 

>>> Seems to have an uncleavable N-term signal seq 

Final Results 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2435 

A DNA sequence (GASx237) was identified in S.pyogenes <SEQ ID 7369> which encodes the amino acid 
sequence <SEQ ID 7370>. Analysis of this protein sequence reveals the following: 

Possible site: 52 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4961 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAB49143 GB:AJ248283 hypothetical protein [Pyrococcus abyssi] 
Identities = 79/229 (34%) , Positives = 131/229 (56%) , Gaps = 11/229 (4%) 

Query: 18 MRFTIDQISMQFPLVEIDLEHGGSWLQQGSlWYHTENOTIiOTKLNGKGSGLGKLVGAIGR 77 

M + 1+ F L+E++L G +V + G+MVY V++ TK G L+GA+ R 

Sbjct: 1 MEYRIEHRPSFSLLEVNLREGEAVQAKAGAMVYMDPTVSIETKARGG LLGALKR 54 

Query: 78 SMVSGESMFITQAMSNGDGKIJUIJ^NTPGQIVALELGEKQyRIOTGAFIALDGSAQYKK^ 137 

S++ GES F+ + G G++ AP PG I++LEL Y GAFL ++ 
Sbjct: 55 SVIiGGESFFMN— VFRGPGRVGFAPGYPGDIISLELNGTLYA-QSGAFLVASEGIDIDVK 111 

Query: 138 RQNIGKftLFGGQGGLFVMTTEGLGTLLANSFGSIKKITLDGGTMTIDNAHWAWSRELDY 197 

6K +FG +G +F++ +G G + +S+G+I+KITL G ++ +D H+VA++ +D+ 
Sbjct: 112 FGG-GKTIFGREG-VFLLELRGKGIVFLSSYGa.IEKITLRGESVIVDTGHMVaFTEGIDF 169 

Query: 198 DIHLENGFMQSIGTGEGVWTFRGHGEIYIQSLNLEQPAfiTLKRYIiPTS 246 

I G ++ +GEG+V F GHG++YIQ+ F + +LP S 

Sbjct: 170 RIRKIGGLKATLFS6EGLVFEFSGHGDVYIQTRSLDGFLSWILPHLPKS 218 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
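
The alignment summary line above (Identities, Positives, Gaps) can be read mechanically. The sketch below is a minimal, hedged example: `FIELD`, `parse_summary` and `truncated_percent` are hypothetical names introduced here, not part of the alignment program. For Example 2435, integer truncation of 79/229 and 11/229 reproduces the printed 34% and 4%; the printed Positives percentage does not follow the same simple arithmetic (131/229 truncates to 57, not 56), so the sketch keeps the printed percentages as parsed rather than recomputing them.

```python
import re

# Hypothetical pattern for one field of a summary line such as
# "Identities = 79/229 (34%) , Positives = 131/229 (56%) , Gaps = 11/229 (4%)".
FIELD = re.compile(r"(Identities|Positives|Gaps)\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_summary(line):
    """Map each field name to (count, alignment_length, printed_percent)."""
    return {
        name: (int(count), int(total), int(percent))
        for name, count, total, percent in FIELD.findall(line)
    }


def truncated_percent(count, total):
    """Integer percentage, truncated rather than rounded."""
    return count * 100 // total


summary = parse_summary(
    "Identities = 79/229 (34%) , Positives = 131/229 (56%) , Gaps = 11/229 (4%)"
)

for name, (count, total, printed) in summary.items():
    print(name, count, total, printed, truncated_percent(count, total))
```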

Example 2436 

A DNA sequence (GASx240R) was identified in S.pyogenes <SEQ ID 7371> which encodes the amino acid 
sequence <SEQ ID 7372>. Analysis of this protein sequence reveals the following: 

Possible site: 35 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2745 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2437 

A DNA sequence (GASx241) was identified in S.pyogenes <SEQ ID 7373> which encodes the amino acid 
sequence <SEQ ID 7374>. Analysis of this protein sequence reveals the following: 

Possible site: 21 

>>> Seems to have an uncleavable N-term signal seq 

Final Results 

bacterial membrane --- Certainty=0.5055 (Affirmative) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAC10175 GB:AJ278302 histidine kinase [Streptococcus pneumoniae] 
Identities = 136/449 (30%) , Positives = 234/449 (51%) , Gaps = 26/449 (5%) 

Query: 8 FLLLSIIVYYMTKIYIFSFLSDITLP VWKQLTI-LALALFFNQFPYLS PLLI 58 

++LL +V + KI IF + I+L ++K + LA+ F Y+ + 

Sbjct: 5 WILLYTLVTHGLKIVIFFKVDGISLTFERIFKAFLFKILIAWFGMLGYMVGNVYLSYFM 64 

Query: 59 DPL LFLWLRQETKQLFSLKALFLAVAPSVLVDLLSRFMGTIVIPyLFLSSGIYLG 114 

+PL L ++LR+ K+L LF + P +LV+L R + V+P FL G 

Sbjct: 65 EPLYGIGLSPLLLKELPKKLL LFYGLFPMILVNLFYRGVSYFVLP--FLGQGQVYD 118 

Query: 115 HIIFDLLAYLIilFPSFAIINYMIGKDYKMIC-QSGYSKRSHNFYQTLLMFVLVYYVDIFV 173 
F L ++IF F + ++ DY + G + T + +++ Y + 

Sbjct: 119 DYSFIWLC-IIIENFFISLAFLKWLDYDFTSLRKGILDKDFQKSLTQINWIMGAYYLVIQ 177 

Query: 174 IIX3FTDPFIiHFHHSLFVPTPYKI.LFIOTILLLVYLLSyFNHSSKEYLK]ffiIiRREQQAYMT 233 
L + + + + T L+ + ++L + ++ + K+L L +EQ 
Sbjct: 178 NLSYFE YEQGIQSTTVRHLILVFYLLFFMGIIKKLDTYLKDKLHERLNQEQDLRYR 233 

Query: 234 NLEIYGKHLEKLYRDVRAFQSDYLSRIERLGQAIKSESITQIQDIYAQTVHEftNDYWDDK 293 

+E Y +H+E+LY++VR+P+ DY + + L 1+ E + QI++IY + ++++ D 
Sbjct: 234 EMERYSRHIEELYKEVRSFRHDYTNIiLTSLRLGIEEEDMEQIKEIYDSVLKDSSEKLQDN 293 

Query: 294 HYNISKLRKINISSIKSLLSAKIISAEKSGIDIJIVEVPDNIKETYIPEIJDLLLLMSIFCD 353 . 

Y++ +L + ++KSLIi+ K I A I NVEVP+ 1+ +. LD L ++SI CD 
Sbjct: 294 KYDlfiRL\nSIVRimiiKSLIjAGKFIKaRDKNIVENVEVPEEIQ\7EGVSLIiDFLT^ 353 

Query: 354 NAIEAALEAQQPHMSIAYFLLGDYQMFWTKTTKKK-VDINKIFEEGYSSKGSERGIGLS 412 

NAIEA++EA QPH+SIA+F G + F++ N+ K++ +DI++IF G SSKG ERG+GL 
Sbjct: 354 NRlEASVEaCQPHVSIAFFKNGRQETFIIENSIKEEGIDISEIFSFGASSKGEERGVGLY 413 

Query: 413 NAQRILKKYPYLSLRTKSFDKEFSQTLTM 441 
+I++ +P SL T D P Q LT+ 

Sbjct: 414 TVMKIVESHPNTSLNTTCQDHVFRQVLTV 442 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2438 

A DNA sequence (GASx242R) was identified in S.pyogenes <SEQ ID 7375> which encodes the amino acid 
sequence <SEQ ID 7376>. Analysis of this protein sequence reveals the following: 

Possible site: 26 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4165 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 



The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2439 

A DNA sequence (GASx243) was identified in S.pyogenes <SEQ ID 7377> which encodes the amino acid 
sequence <SEQ ID 7378>. Analysis of this protein sequence reveals the following: 

Possible site: 26 

>>> Seems to have an uncleavable N-term signal seq 
INTEGRAL Likelihood = -11.09 Transmembrane 188 - 204 ( 182 - 208) 
INTEGRAL Likelihood = -7.17 Transmembrane 52 - 68 ( 47 - 69) 
INTEGRAL Likelihood = -4.73 Transmembrane 119 - 135 ( 114 - 142) 
INTEGRAL Likelihood = -4.62 Transmembrane 83 - 99 ( 77 - 107) 
INTEGRAL Likelihood = -1.86 Transmembrane 328 - 344 ( 328 - 345) 
INTEGRAL Likelihood = -1.65 Transmembrane 7 - 23 ( 6 - 23) 
INTEGRAL Likelihood = -0.22 Transmembrane 35 - 51 ( 35 - 51) 

Final Results 

bacterial membrane --- Certainty=0.5437 (Affirmative) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAC10175 GB:AJ278302 histidine kinase [Streptococcus pneumoniae] 
Identities = 123/438 (28%) , Positives = 229/438 (52%) , Gaps = 49/438 (11%) 



Query: 20 VIFAKVSAIKLSWKRVS IIGISFVIANMIFDKVIL---IDQLFFIIVSLL--- 66 
          VIF KV I L+++R+ ++ + F + + V L ++ L+ I +S L 
Sbjct: 19 VIFFKVDGISLTFERIFKAFLFKILLAVVFGMLGYMVGNVYISYFMEPLYGIGLSFLLLR 78 

Query: 67 SAPKKKLFEHMFNGFFTILIVELLFRVIGSFFLPAVLGFSIGQINNNLKLLELCYLFVLP 126 
          PKK L +F G F +++V L +R + F LP + GQ+ ++ + LC + + 
Sbjct: 79 ELPKKLL LFYGLFPMILVNLFYRGVSYFVLPFL---GQGQVYDDYSFIWLC-IIIFN 131 

Query: 127 I FYLFSYI FS IDL SLIRFI SEDKMKKWVFWMNTAMFSYYFFAHFLVTVQSGFLALYF 183 
           F +++ +D SL + I + +K + +N M +YY L YF 
Sbjct: 132 FPISLAFLKWLDYDFTSLRKGILDKDFQKSLTQINWIMGAYYLVIQNLS YF 182 

Query: 184 QY RSILVFIYLAIFIWVIVKLDRFAKDQLSQKLTQAQNERIAYLENYNQSI 234 
           +Y R +++ YL F+ +I KLD + KD+L ++L Q Q+ R +E Y++ I 
Sbjct: 183 EYEQGIQSTTVRHLILVFYLLFFMGIIKKLDTYLKDKLHERLNQEQDLRYREMERYSRHI 242 

Query: 235 EQLYREIRTVKHDSENILISLKDSIDSGDIDLITRVYDTVIQQSATSMMRTNYEISSLDN 294 
           E+LY+E+R+ +HD N+L SL+ I+ D++ I +YD+V++ S+ + Y++ L N 
Sbjct: 243 EELYKEVRSFRHDYTNLLTSLRLGIEKEIMEQIKEIYDSVLKDSSEKLQDNKYDLGRLVN 302 

Query: 295 IKEAVIRSIMNSKLLEAQYLGIELYIEIPDVIDHLPIKLIDLIVLFTGLVDNAIETAKGS 354 
           +++ ++S++ K ++A+ I +E+P+ I + L+D + + + L DNAIE + + 
Sbjct: 303 VRDRALKSLLAGKFIKARDKNIVFNVEVPEEIQVEGVSLLDFLTWSILCDNAIEASVEA 362 

Query: 355 RRPFLSIAYFKQDNKQLFIIENSTKTNRVDIAKRFDAQQQNSAH FLTVLDSY 406 
           +P +SIA+FK ++ FIIENS K +DI++ F + + +++S+ 
Sbjct: 363 CQPHVSIAFFKNGAQETFIIENSIKEEGIDISEIFSFGASSKGEERGVGLYTVMKIVESH 422 

Query: 407 PQITLSTKSDHYRLRQLL 424 
           P +L+T + RQ+L 
Sbjct: 423 ENTSLNTTCQDHVFRQVL 440 





Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2440 

A DNA sequence (GASx248) was identified in S.pyogenes <SEQ ID 7379> which encodes the amino acid 
sequence <SEQ ID 7380>. Analysis of this protein sequence reveals the following: 

Possible site: 32 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.5665 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2441 

A DNA sequence (GASx255) was identified in S.pyogenes <SEQ ID 7381> which encodes the amino acid 
sequence <SEQ ID 7382>. Analysis of this protein sequence reveals the following: 

Possible site: 19 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.1437 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2442 

A DNA sequence (GASx270R) was identified in S.pyogenes <SEQ ID 7383> which encodes the amino acid 
sequence <SEQ ID 7384>. Analysis of this protein sequence reveals the following: 

Possible site: 21 

>>> Seems to have no N-terminal signal sequence 
INTEGRAL Likelihood = -5.89 Transmembrane 20 - 36 ( 17 - 36) 

Final Results 

bacterial membrane --- Certainty=0.3357 (Affirmative) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2443 

A DNA sequence (GASx272) was identified in S.pyogenes <SEQ ID 7385> which encodes the amino acid 
sequence <SEQ ID 7386>. Analysis of this protein sequence reveals the following: 

Possible site: 58 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2488 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAB11887 GB:Z99104 ribosomal protein S7 (BS7) [Bacillus subtilis] 
Identities = 117/156 (75%) , Positives = 139/156 (89%) 

Query: 1 MSRKNQAPKREVLPDPLYNSKIVTRLINRVMI£IGKRGTAATIVmAENAIKEa,TG^ 60 

M RK KR+VLPDP+X1ISK+V+RLIN++M+DGK+G TI+Y +F+ IKE TGNDA+E 
Sbjct: 1 MPRKGPV2UCRDVLPDPI™SKLVSRLINKlylMIr)GKKGKPQTILYKSFDIIKERTG^roAME 60 

Query: 61 VFETAMDNIMPVLEVRARRVGGSNYQVPVEVRPERRTTLGLRWLVNASRARGEHTMKDRL 120 

VFE A+ NIMPVLEV+ARRVGG+NYQVPVEVRPERRTTLGLRWLVN +R RGB TM++RL 
Sbjct: 61 VFEQMiKNIMPVLEVKRRRVGGftNYQVPVEVRPERRTTliGIjRWLVNYJaiLRGEKTMEBRIi 120 

Query: 121 iUCEIMDftflNimSMVKKREDTHKMAEANRAFMFRW 156 

A EI+nRBNNTG»i+VKKREDTHKMREfiN+AFJVH+RW 
Sbjct: 121 ANEILDaftWNTGaAVKKREDTHKmEANKAFAHVRW 156 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2444 

A DNA sequence (GASx274) was identified in S.pyogenes <SEQ ID 7387> which encodes the amino acid 
sequence <SEQ ID 7388>. Analysis of this protein sequence reveals the following: 

Possible site: 61 

>>> Seems to have an uncleavable N-term signal seq 

Final Results 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

A related sequence was also identified in GAS <SEQ ID 9095> which encodes the amino acid sequence 
<SEQ ID 9096>. Analysis of this protein sequence reveals the following: 

Possible cleavage site: 52 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.291 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2445 

A DNA sequence (GASx275R) was identified in S.pyogenes <SEQ ID 7389> which encodes the amino acid 
sequence <SEQ ID 7390>. Analysis of this protein sequence reveals the following: 

Possible site: 16 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.5664 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2446 

A DNA sequence (GASx283) was identified in S.pyogenes <SEQ ID 7391> which encodes the amino acid 
sequence <SEQ ID 7392>. Analysis of this protein sequence reveals the following: 

Possible site: 18 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.0724 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2447 

A DNA sequence (GASx298) was identified in S.pyogenes <SEQ ID 7393> which encodes the amino acid 
sequence <SEQ ID 7394>. Analysis of this protein sequence reveals the following: 

Possible site: 25 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2840 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2448 

A DNA sequence (GASx300) was identified in S.pyogenes <SEQ ID 7395> which encodes the amino acid 
sequence <SEQ ID 7396>. Analysis of this protein sequence reveals the following: 

Possible site: 18 

>>> Seems to have an uncleavable N-term signal seq 
INTEGRAL Likelihood = -1.91 Transmembrane 4 - 20 ( 4 - 20) 

Final Results 

bacterial membrane --- Certainty=0.1765 (Affirmative) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
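
The INTEGRAL line reported for this example can be read in the same mechanical way as the result lines. The sketch below is a hedged illustration only: `INTEGRAL_LINE` and `parse_segments` are hypothetical names, and because the source does not state how the range in parentheses differs from the first range, the sketch simply records both without interpreting them.

```python
import re

# Hypothetical pattern for a line such as
# "INTEGRAL Likelihood = -1.91 Transmembrane 4 - 20 ( 4 - 20)".
INTEGRAL_LINE = re.compile(
    r"INTEGRAL\s+Likelihood\s*=\s*(-?\d+\.\d+)\s+Transmembrane\s+"
    r"(\d+)\s*-\s*(\d+)\s*\(\s*(\d+)\s*-\s*(\d+)\s*\)"
)


def parse_segments(text):
    """Return one dict per INTEGRAL line: likelihood plus both residue ranges."""
    segments = []
    for match in INTEGRAL_LINE.finditer(text):
        segments.append(
            {
                "likelihood": float(match.group(1)),
                "range": (int(match.group(2)), int(match.group(3))),
                "range_in_parentheses": (int(match.group(4)), int(match.group(5))),
            }
        )
    return segments


segments = parse_segments(
    "INTEGRAL Likelihood = -1.91 Transmembrane 4 - 20 ( 4 - 20)"
)
for segment in segments:
    start, end = segment["range"]
    # Residue positions are inclusive, so the span length is end - start + 1.
    print(segment["likelihood"], start, end, end - start + 1)
```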

Example 2449 

A DNA sequence (GASx301) was identified in S.pyogenes <SEQ ID 7397> which encodes the amino acid 
sequence <SEQ ID 7398>. Analysis of this protein sequence reveals the following: 

Possible site: 33 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4884 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2450 

A repeated DNA sequence (GASx302) was identified in S.pyogenes <SEQ ID 7399> which encodes the 
amino acid sequence <SEQ ID 7400>. Analysis of this protein sequence reveals the following: 

Possible site: 22 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2581 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2451 

A DNA sequence (GASx316) was identified in S.pyogenes <SEQ ID 7401> which encodes the amino acid 
sequence <SEQ ID 7402>. Analysis of this protein sequence reveals the following: 

Possible site: 18 

>>> Seems to have no N-terminal signal sequence 
INTEGRAL Likelihood = -0.80 Transmembrane 23 - 39 ( 22 - 39) 

Final Results 

bacterial membrane --- Certainty=0.1319 (Affirmative) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2452 

A DNA sequence (GASx323R) was identified in S.pyogenes <SEQ ID 7403> which encodes the amino acid 
sequence <SEQ ID 7404>. Analysis of this protein sequence reveals the following: 

Possible site: 28 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.0005 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2453 

A DNA sequence (GASx334) was identified in S.pyogenes <SEQ ID 7405> which encodes the amino acid 
sequence <SEQ ID 7406>. Analysis of this protein sequence reveals the following: 

Possible site: 17 

>>> Seems to have a cleavable N-term signal seq. 

Final Results 

bacterial outside --- Certainty=0.3000 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2454 

A DNA sequence (GASx336) was identified in S.pyogenes <SEQ ID 7407> which encodes the amino acid 
sequence <SEQ ID 7408>. Analysis of this protein sequence reveals the following: 

Possible site: 31 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.3379 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2455 

A DNA sequence (GASx361R) was identified in S.pyogenes <SEQ ID 7409> which encodes the amino acid 
sequence <SEQ ID 7410>. Analysis of this protein sequence reveals the following: 

Possible site: 22 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2807 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2456 

A DNA sequence (GASx387) was identified in S.pyogenes <SEQ ID 7411> which encodes the amino acid 
sequence <SEQ ID 7412>. Analysis of this protein sequence reveals the following: 

Possible site: 16 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2740 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2457 

A DNA sequence (GASx389) was identified in S.pyogenes <SEQ ID 7413> which encodes the amino acid 
sequence <SEQ ID 7414>. Analysis of this protein sequence reveals the following: 

Possible site: 21 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.0744 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2458 

A DNA sequence (GASx392) was identified in S.pyogenes <SEQ ID 7415> which encodes the amino acid 
sequence <SEQ ID 7416>. Analysis of this protein sequence reveals the following: 

Possible site: 29 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2162 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2459 

A DNA sequence (GASx393R) was identified in S.pyogenes <SEQ ID 7417> which encodes the amino acid 
sequence <SEQ ID 7418>. Analysis of this protein sequence reveals the following: 

Possible site: 18 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2520 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2460 

A DNA sequence (GASx395) was identified in S.pyogenes <SEQ ID 7419> which encodes the amino acid 
sequence <SEQ ID 7420>. Analysis of this protein sequence reveals the following: 

20 Possible site: 16 

>» Seems to have no N-teirminal signal sequence 

Final Results 

25 bacterial cytoplasm Certainty=0. 2590 (Affirmative) < suco 

bacterial membrane Certaintyi=0 . 0000 (Not Clear) < suco 

bacterial outside Certainty=0 . 0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2461 

A DNA sequence (GASx396) was identified in S.pyogenes <SEQ ID 7421> which encodes the amino acid 
sequence <SEQ ID 7422>. Analysis of this protein sequence reveals the following: 

Possible site: 41 

>>> Seems to have an uncleavable N-term signal seq 

Final Results 

bacterial membrane Certainty=0.0000 (Not Clear) < suco 

bacterial outside — Certainty=0. 0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0. 0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAB13373 GB:Z99111 similar to hypothetical proteins [Bacillus subtilis] 
Identities = 23/88 (26%) , Positives = 52/88 (58%) 

Query: 4 KQERlGLVVYLYXWRDARKLSKFGDLyraSKRSRYLIIYINKNDLDTKLEEMRRLKCVKD 63 

+ R G+WYL+ + ++ L KFG+++Y SKR +Y+++Y + + ++ ++++ VK 
Sbjct: 2 EaNRRQGIWVYLHSLKQSKMiRKFGimmrSKRLKYVVLYCDMDQI^ 61 

Query: 64 IRPSAFDDIDRQFVGNLHRDETNNHQRG 91 

+ PS + +P Ii + + +++ G 
Sbjct: 62 VEPSYKPFLKIiEFESKLDKiyKEYDYKIG 89 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
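
The identity figures quoted with each alignment (for instance "Identities = 23/88 (26%)") can be checked by simple arithmetic. The sketch below assumes the percentage is the integer part of 100 x matches / length, a rule that reproduces the identity percentages quoted in the alignments of this section. The "Positives" percentages are in some cases one point lower than this rule gives, so the sketch is applied to identities only. The names `identity_percent` and `check_identity_line` are hypothetical.

```python
import re

# Matches the "Identities = m/n (p%)" field of an alignment header line.
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")

def identity_percent(matches, length):
    """Integer percentage, rounded down (assumed reporting convention)."""
    return (100 * matches) // length

def check_identity_line(line):
    """Return (matches, length, reported, recomputed) for an Identities line."""
    found = IDENT_RE.search(line)
    if found is None:
        raise ValueError("no Identities field found")
    matches, length, reported = (int(group) for group in found.groups())
    return matches, length, reported, identity_percent(matches, length)

# Header line from Example 2461 above.
line = "Identities = 23/88 (26%) , Positives = 52/88 (58%)"
print(check_identity_line(line))
```

A mismatch between the reported and recomputed values would point to a transcription error in the header rather than to a different alignment.
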

Example 2462 

A DNA sequence (GASx400) was identified in S.pyogenes <SEQ ID 7423> which encodes the amino acid 
sequence <SEQ ID 7424>. Analysis of this protein sequence reveals the following: 

Possible site: 13 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm Certainty=0.2010 (Affirmative) < suco 

bacterial membrane Certainty=0. 0000 (Not Clear) < suco 

bacterial outside Certainty=0. 0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2463 

A DNA sequence (GASx401) was identified in S.pyogenes <SEQ ID 7425> which encodes the amino acid 
sequence <SEQ ID 7426>. Analysis of this protein sequence reveals the following: 

Possible site: 17 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm Certainty=0.1176 (Affirmative) < suco 

bacterial membrane — Certainty=0.0000 (Not Clear) < suco 

bacterial outside Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2464 

A DNA sequence (GASx402) was identified in S.pyogenes <SEQ ID 7427> which encodes the amino acid 
sequence <SEQ ID 7428>. Analysis of this protein sequence reveals the following: 

Possible site: 16 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm Certainty=0. 2938 (Affirmative) < suco 

bacterial membrane Certainty=0.0000 (Not Clear) < suco 

bacterial outside Certainty=0. 0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2465 

A DNA sequence (GASx403R) was identified in S.pyogenes <SEQ ID 7429> which encodes the amino acid 
sequence <SEQ ID 7430>. Analysis of this protein sequence reveals the following: 

Possible site: 21 

>>> Seems to have a cleavable N-term signal seq. 

Final Results 

bacterial outside Certainty=0.3000 (Affirmative) < suco 

bacterial membrane Certainty=0. 0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0. 0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2466 

A DNA sequence (GASx406) was identified in S.pyogenes <SEQ ID 7431> which encodes the amino acid 
sequence <SEQ ID 7432>. Analysis of this protein sequence reveals the following: 

Possible site: 31 

>>> Seems to have an uncleavable N-term signal seq 

INTEGRAL Likelihood =-12.26 Transmembrane 15 - 31 ( 4 - 36) 
INTEGRAL Likelihood = -6.64 Transmembrane 96 - 112 ( 94 - 115) 

Final Results 

bacterial membrane Certainty=0. 5904 (Affirmative) < suco 

bacterial outside Certainty=0.0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
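
Where the analysis reports predicted membrane-spanning segments, each "INTEGRAL" line gives a likelihood value and the residue range of the segment. The following minimal sketch extracts those fields and picks the strongest segment. It assumes that a more negative likelihood indicates a stronger prediction; the name `parse_segments` is hypothetical.

```python
import re

# Matches e.g. "INTEGRAL Likelihood =-12.26 Transmembrane 15 - 31 ( 4 - 36)".
# Only the likelihood and the core segment range are captured.
TM_RE = re.compile(
    r"INTEGRAL\s+Likelihood\s*=\s*(-?\d+\.\d+)\s+Transmembrane\s+(\d+)\s*-\s*(\d+)"
)

def parse_segments(text):
    """Return a list of (likelihood, start, end) tuples, in order of appearance."""
    return [
        (float(score), int(start), int(end))
        for score, start, end in TM_RE.findall(text)
    ]

# Lines from Example 2466 above.
example = """
INTEGRAL Likelihood =-12.26 Transmembrane 15 - 31 ( 4 - 36)
INTEGRAL Likelihood = -6.64 Transmembrane 96 - 112 ( 94 - 115)
"""
segments = parse_segments(example)

# Assumption: the most negative likelihood marks the strongest segment.
strongest = min(segments)
lengths = [end - start + 1 for _, start, end in segments]
print(strongest, lengths)
```

Both segments in this example span 17 residues, which matches the fixed window width seen in the other "INTEGRAL" lines of this section.
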

Example 2467 

A DNA sequence (GASx408R) was identified in S.pyogenes <SEQ ID 7433> which encodes the amino acid 
sequence <SEQ ID 7434>. Analysis of this protein sequence reveals the following: 

Possible site: 19 

>>> Seems to have no N-terminal signal sequence 

INTEGRAL Likelihood = -2.23 Transmembrane 17 - 33 ( 15 - 34) 
INTEGRAL Likelihood = -0.85 Transmembrane 38 - 54 ( 38 - 54) 

Final Results 

bacterial membrane Certainty=0 .1893 (Affirmative) < suco 

bacterial outside Certainty=0 . 0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0. 0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2468 

A DNA sequence (GASx412) was identified in S.pyogenes <SEQ ID 7435> which encodes the amino acid 
sequence <SEQ ID 7436>. Analysis of this protein sequence reveals the following: 

Possible site: 13 

>>> Seems to have an uncleavable N-term signal seq 

INTEGRAL Likelihood = -6.53 Transmembrane 5 - 21 ( 4 - 23) 

Final Results 

bacterial membrane Certainty=0.3612 (Affirmative) < suco 

bacterial outside — Certainty=0.0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2469 

A DNA sequence (GASx413) was identified in S.pyogenes <SEQ ID 7437> which encodes the amino acid 
sequence <SEQ ID 7438>. Analysis of this protein sequence reveals the following: 

Possible site: 56 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm Certainty=0 . 3422 (Affirmative) < suco 

bacterial membrane Certainty=0.0000 (Not Clear) < suco 

bacterial outside Certainty=0 . 0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAA68903 GB:Y07622 lactate oxidase [Streptococcus iniae] 
Identities = 328/392 (83%) , Positives = 359/392 (90%) , Gaps = 4/392 (1%) 

Query: 3 MAQKTVITEETTDFVMDFKTSSAEGNVDFINVFDLEKMAQQVIPKGAFGYIASGAGDTFT 62 

M K+ + TT ++Fl(CrSSREG+VDF+NVFDIiEKMAQ+VIPKGRFGYIASGftGDTFT 
Sbjct: 1 MENKSEMINATT---IEPKTSSAEGSVDFV]WFDLEKMftQKVIPKGAFGyiM 57 

Query: 63 LHmiRSFNHKLlVPHSLKGVBNPSTEITFDGDYLTSPLILAPVaAHKIAlffiQGEV^^ 122 

LHENIRSFNHKLI PH LK6VEWPSTEITF GD h SP+ILAPVAftHKLaNEQGE+AS2«C 
Sbjct: 58 LHENIRSFffiJKLI-PHGLKGVENPSTEITFIGDKLASPIILAPVaaHKLANEQGEIASAK 116 

Query: 123 GLKEFGSIYTTSSYSTTDLPEISAALGGTPHWFQFYYSKDDGINENIMDRVKRQGCKAIV 182 

G+KEFG+IYTTSSYSTTDLPEIS LG +PHWFQFYYSKDDGINR+IMDR+KA+G K+IV 
Sbjct: 117 GVKEFGTIYTTSSYSTTDLPEISQTLGDSPHWFQFYYSKDDGINRHIMDRLKAEGVKSIV 176 

Query: 183 LTADAOTGGITOEVDRRNGFVFPVGMPIVQEYLPDGftGKriTOYVyKSAKQftLTSKDIEYIA 242 

LT DATVGGNREVD+RNGPVFPVGMPIVQEYLP+GAGKrMDYVYK+ KQfiL+ KD+EYIA 
Sbjct: 177 nTTOATVGGNREVDKKNGFWPVGMPIVQEYLENGAGKTiroYVYKRTKQALSPKD 236 

Query: 243 TYSGLPVYVKGPQCAEDTLRALDAGASGIWVTNHGGRQLDGGPAAFDSLQEVAEAVDQKV 302 

YSGLPVYVKGPQCJiED RAL+AGASGIWVTNHGGRQLDGGPAAFDSLQEVAE+VD++V 
Sbjct: 237 QYSGLPVYVKGPQCJfflDAPRALEAGRSGIWrNHGGRQLDGGPAAFDSLQEVAESVDRRV 296 

Query: 303 PIVFDSGIRRGQHIFKALASGADLVALGRPAIYGIAMGGSIGTRQVFEKLNDELKMVMQL 362 

PIVFDSG+RRGQH+FKALASGADLVALGRP lYGLAMGGS+GTRQVFEK+NDELKMVMQL 
Sbjct: 297 PIVFDSGVRRGQHVFKALASGADLVALGRPVIYGLAMGGSVGTRQVFEKINDELKMVMQI, 356 

Query: 363 AGTQTIQDVKAKNLRHNPYDSSIPFDQNALRL 394 

AGTQTI DVK F LRHNPYDSSIPF ++ 
Sbjct: 357 AGTQTIDDVKHFKIiRHNPYDSSIPFSPKCFKl 388 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2470 

A DNA sequence (GASx414) was identified in S.pyogenes <SEQ ID 7439> which encodes the amino acid 
sequence <SEQ ID 7440>. Analysis of this protein sequence reveals the following: 

Possible site: 32 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm Certainty=0. 0682 (Affirmative) < suco 

bacterial membrane Certainty=0 . 0000 (Not Clear) < suco 

bacterial outside Certainty=0. 0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2471 

A DNA sequence (GASx417R) was identified in S.pyogenes <SEQ ID 7441> which encodes the amino acid 
sequence <SEQ ID 7442>. Analysis of this protein sequence reveals the following: 

Possible site: 34 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm Certainty=0 . 1765 (Affirmative) < suco 

bacterial membrane Certainty=0.0000 (Not Clear) < suco 

bacterial outside — Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2472 

A DNA sequence (GASx418) was identified in S.pyogenes <SEQ ID 7443> which encodes the amino acid 
sequence <SEQ ID 7444>. Analysis of this protein sequence reveals the following: 

Possible site: 32 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm — Certainty=0. 2532 (Affirmative) < suco 

bacterial membrane Certainty=0. 0000 (Not Clear) < suco 

bacterial outside — Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2473 

A DNA sequence (GASx419) was identified in S.pyogenes <SEQ ID 7445> which encodes the amino acid 
sequence <SEQ ID 7446>. Analysis of this protein sequence reveals the following: 

Possible site: 28 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm Certainty=0 . 3082 (Affirmative) < suco 

bacterial membrane — Certainty=0.0000 (Not Clear) < suco 

bacterial outside — Certainty=0. 0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2474 

A DNA sequence (GASx423) was identified in S.pyogenes <SEQ ID 7447> which encodes the amino acid 
sequence <SEQ ID 7448>. Analysis of this protein sequence reveals the following: 

Possible site: 52 




>>> Seems to have an uncleavable N-term signal seq 

INTEGRAL Likelihood = -2.18 Transmembrane 14 - 30 ( 13 - 31) 



Final Results 

bacterial membrane Certainty=0.1871 (Affirmative) < suco 

bacterial outside Certainty=0. 0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2475 

A DNA sequence (GASx427R) was identified in S.pyogenes <SEQ ID 7449> which encodes the amino acid 
sequence <SEQ ID 7450>. Analysis of this protein sequence reveals the following: 

Possible site: 25 



>>> Seems to have an uncleavable N-term signal seq 

INTEGRAL Likelihood = -1.17 Transmembrane 13 - 29 ( 10 - 29) 

Final Results 

bacterial membrane Certainty=0.1468 (Affirmative) < suco 

bacterial outside — Certainty=0.0000 (Not Clear) < suco 

bacterial cytoplasm — Certainty=0.0000 (Not Clear) < suco 

A related sequence was also identified in GAS <SEQ ID 9105> which encodes the amino acid sequence 
<SEQ ID 9106>. Analysis of this protein sequence reveals the following: 

Possible site: 20 

>>> Seems to have an uncleavable N-term signal seq 

INTEGRAL Likelihood = -1.17 Transmembrane 8 - 24 

Final Results 

bacterial membrane Certainty=0.1470 (Affirmative) < suco 

bacterial outside Certainty=0.0000 (Not Clear) < suco 

bacterial cytoplasm — Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAA26616 GB:M63917 epidermal cell differentiation inhibitor [Staphylococcus aureus] 

Identities = 58/195 (29%) , Positives = 106/195 (53%) , Gaps = 13/195 (6%) 

Query: 67 RWGKGLI YPRAEQEAMAAYTCQQAGPINTSLDKAKGELSQLTPELRDQVAQLDAAT 122 

+WG LI Y ++ A+ YT + + IN L A G++++L +D+V +LD++ 
Sbjct: 49 KWGHKLIKQAKYSSDDKXALYEYT-KDSSKINGPLRLAGGDINKLDSTTQDKVRRLDSSI 107 

Query: 123 HRLVIPWNIVVYRyVYETFLRDI-GVSHADLTSYYR--NHQFDPHILCKIK--LGTR-YT 176 

+ P ++ VYR + +L I 6 ++ DL + N Q+D +++ K+ + +R Y 
Sbjct: 108 SKSTTPESVYVYRLIiOTiDYLTSIVGFTNEELYKLQQMGQYDENLVRKLN^^ 167 



Query: 177 KHSEMSTTALKNGAMTHRPVEVRICVKKGAKAAFV--EPYSAVPSEVELLFPRGCQLEVV 234 

+ + ST + A+ RP+E+R+ + KG KAA++ + +A + E+L PRG + V 
Sbjct: 168 EDGYSSTQLVSGAAVGGRPIELRLELPKGTKAAYLNSKDLTAYYGQQEVLLPRGTEYAVG 227 






Query: 235 GAYVSQDQKKLHIEA 249 
+S D+KK+ I A 
Sbjct: 228 SVELSNDKKKIIITA 242 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2476 

A DNA sequence (GASx428) was identified in S.pyogenes <SEQ ID 7451> which encodes the amino acid 
sequence <SEQ ID 7452>. Analysis of this protein sequence reveals the following: 

Possible site: 14 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm Certainty=0 .3817 (Affirmative) < suco 

bacterial membrane Certainty=0. 0000 (Not Clear) < suco 

bacterial outside Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2477 

A DNA sequence (GASx429) was identified in S.pyogenes <SEQ ID 7453> which encodes the amino acid 
sequence <SEQ ID 7454>. Analysis of this protein sequence reveals the following: 

Possible site: 32 

>>> Seems to have a cleavable N-term signal seq. 

Final Results 

bacterial outside Certainty=0. 3000 (Affirmative) < suco 

bacterial membrane Certainty=0.0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2478 

A DNA sequence (GASx431) was identified in S.pyogenes <SEQ ID 7455> which encodes the amino acid 
sequence <SEQ ID 7456>. Analysis of this protein sequence reveals the following: 

Possible site: 43 

>>> Seems to have an uncleavable N-term signal seq 

INTEGRAL Likelihood = -8.60 Transmembrane 68 - 84 ( 66 - 90) 

INTEGRAL Likelihood = -6.85 Transmembrane 22 - 38 ( 16 - 42) 

INTEGRAL Likelihood = -3.29 Transmembrane 44 - 60 ( 43 - 61) 

Final Results 

bacterial membrane Certainty=0.4439 (Affirmative) < suco 

bacterial outside — Certainty=0.0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2479 

A DNA sequence (GASx432R) was identified in S.pyogenes <SEQ ID 7457> which encodes the amino acid 
sequence <SEQ ID 7458>. Analysis of this protein sequence reveals the following: 

Possible site: 22 

>>> Seems to have a cleavable N-term signal seq. 

Final Results 

bacterial outside Certainty=0 . 3000 (Affirmative) < suco 

bacterial membrane Certainty=0. 0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0. 0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2480 

A DNA sequence (GASx434) was identified in S.pyogenes <SEQ ID 7459> which encodes the amino acid 
sequence <SEQ ID 7460>. Analysis of this protein sequence reveals the following: 

Possible site: 24 

>>> Seems to have a cleavable N-term signal seq. 

Final Results 

bacterial outside Certainty=0. 3000 (Affirmative) < suco 

bacterial membrane Certainty=0. 0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2481 

A DNA sequence (GASx435R) was identified in S.pyogenes <SEQ ID 7461> which encodes the amino acid 
sequence <SEQ ID 7462>. Analysis of this protein sequence reveals the following: 

Possible site: 25 

>>> Seems to have an uncleavable N-term signal seq 

INTEGRAL Likelihood = -2.50 Transmembrane 4 - 20 ( 3 - 21) 

Final Results 

bacterial membrane Certainty=0 . 1999 (Affirmative) < suco 

bacterial outside Certainty=0 . 0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAB59092 GB:M97157 pyrogenic exotoxin C [Streptococcus pyogenes] 
Identities = 110/229 (48%), Positives = 150/229 (65%), Gaps = 4/229 (1%) 

Query: 4 IIKTII]liVIIIFHGYGS--VKSDSE-NIKDVKIjQLNYAYEIIPVDYTNCNIDyLTTHDFY 60 

IIK + ++ +1 S +KSDS+ +1 +VK L YAY I P DY +C +++ TTH 

Sbjct: 6 IIKIWIITVILISTISPIIKSDSKKDISNVKSDLLYAYTITPYDYKDCRVNFSTTHTLN 65 

Query: 61 IDISSYKKKNFSVDSEVESYITTKFTKNQKVNIFGLPYIFTRYDVYYIYGGVTPSVNSNS 120 

ID Y+ K++ + SE+ + KF ++ V++FGL YI + YIYGG+TP+ N N 
Sbjct: 66 IDTQKYRGKDYYISSEMSYEASQKFKRDDHVDVFGLFYIENSHTGEYIYGGITPAQN-NK 124 

Query: 121 ENSKIVGNLLIDGVQQKTLINPIKIDKPIFTIQEFDFKIRQYLMQTYKIYDPNSPYIRGQ 180 

N K++GNL I G Q+ L N I ++K I T QE DFKIR+YLM YKIYD SPY+ G+ 
Sbjct: 125 VNHKLLGNLFISGESQQNLNNKIILEKDIVTFQEIDFKIRKYLMDNYKIYDATSPYVSGR 184 

Query: 181 LEIAINGNKHESBmjYDATSSSTRSDIFKKYKDNKTINMKDFSHFDIYL 229 
+EI KHE +IJ+D+ + TRSDIF KYKDN+ INMK+FSHFDIYL 

Sbjct: 185 lEIGTKDGKHEQIDLFDSENEGTRSDIFAKYKDNRIINMKNFSHFDIYL 233 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2482 

A DNA sequence (GASx436R) was identified in S.pyogenes <SEQ ID 7463> which encodes the amino acid 
sequence <SEQ ID 7464>. Analysis of this protein sequence reveals the following: 

Possible site: 22 

>>> Seems to have a cleavable N-term signal seq. 

Final Results 

bacterial outside — Certainty=0.3000 (Affirmative) < suco 

bacterial membrane Certainty=0.0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2483 

A DNA sequence (GASx446) was identified in S.pyogenes <SEQ ID 7465> which encodes the amino acid 
sequence <SEQ ID 7466>. Analysis of this protein sequence reveals the following: 

Possible site: 20 

>>> Seems to have a cleavable N-term signal seq. 

Final Results 

bacterial outside Certainty=0.3000 (Affirmative) < suco 

bacterial membrane Certainty=0 . 0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2484 

A DNA sequence (GASx449) was identified in S.pyogenes <SEQ ID 7467> which encodes the amino acid 
sequence <SEQ ID 7468>. Analysis of this protein sequence reveals the following: 

Possible site: 15 

>>> Seems to have an uncleavable N-term signal seq 

INTEGRAL Likelihood = -3.82 Transmembrane 3 - 19 ( 1 - 20) 

Final Results 

bacterial membrane Certainty=0.2529 (Affirmative) < suco 

bacterial outside — Certainty=0.0000 (Not Clear) < suco 

bacterial cytoplasm — Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2485 

A DNA sequence (GASx450R) was identified in S.pyogenes <SEQ ID 7469> which encodes the amino acid 
sequence <SEQ ID 7470>. Analysis of this protein sequence reveals the following: 

Possible site: 30 

>>> Seems to have an uncleavable N-term signal seq 

INTEGRAL Likelihood = -1.44 Transmembrane 21 - 37 ( 19 - 37) 

Final Results 

bacterial membrane Certainty=0.1574 (Affirmative) < suco 

bacterial outside Certainty=0 . 0000 (Not Clear) < suco 

bacterial cytoplasm — Certainty=0. 0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2486 

A DNA sequence (GASx457R) was identified in S.pyogenes <SEQ ID 7471> which encodes the amino acid 
sequence <SEQ ID 7472>. Analysis of this protein sequence reveals the following: 

Possible site: 19 

>>> Seems to have a cleavable N-term signal seq. 

INTEGRAL Likelihood =-15.34 Transmembrane 64 - 80 ( 57 - 86) 

INTEGRAL Likelihood =-13.43 Transmembrane 97 - 113 ( 91 - 116) 

INTEGRAL Likelihood = -5.57 Transmembrane 38 - 54 ( 32 - 56) 

Final Results 

bacterial membrane — Certainty=0. 7135 (Affirmative) < suco 

bacterial outside — Certainty= 0.0000 (Not Clear) < suco 

bacterial cytoplasm — Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2487 

A DNA sequence (GASx476R) was identified in S.pyogenes <SEQ ID 7473> which encodes the amino acid 
sequence <SEQ ID 7474>. Analysis of this protein sequence reveals the following: 

Possible site: 31 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm Certainty=0.3013 (Affirmative) < suco 

bacterial membrane Certainty=0.0000 (Not Clear) < suco 

bacterial outside Certainty=0 . 0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2488 

A DNA sequence (GASx477) was identified in S.pyogenes <SEQ ID 7475> which encodes the amino acid 
sequence <SEQ ID 7476>. Analysis of this protein sequence reveals the following: 

Possible site: 57 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm — Certainty=0.1022 (Affirmative) < suco 

bacterial membrane — Certainty=0.0000 (Not Clear) < suco 

bacterial outside — Certainty= 0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAC03521 GB:AJ276410 BlpJ protein [Streptococcus pneumoniae] 
Identities = 47/77 (61%) , Positives = 59/77 (76%) 

Query: 1 MIKFAEEIQKEELFHIIGGYSATDCKNHLIGGITSG3VlAGGVGAGMATLGVGGVftGAFAG 60 
M+ E + E L + GGYS+TDC+N LI G+T+G I GG GAG+ATLGV G+AGAF G 

Sbjct: 5 MLSQLEVMDTEMLAKVEGGYSSTDCQNALITGVTTGIITGGTGAGLATLGVAGLAGAFVG 64 

Query: 61 AHVGAIAGGLTCVGGML 77 

AH+GAI GGLTC+GGM+ 
Sbjct: 65 AHIGAIGGGLTCLGGMV 81 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2489 

A DNA sequence (GASx478) was identified in S.pyogenes <SEQ ID 7477> which encodes the amino acid 
sequence <SEQ ID 7478>. Analysis of this protein sequence reveals the following: 

Possible site: 45 

>>> Seems to have no N-terminal signal sequence 

INTEGRAL Likelihood = -2.07 Transmembrane 42 - 58 ( 41 - 58) 
INTEGRAL Likelihood = -1.59 Transmembrane 22 - 38 ( 22 - 38) 

Final Results 

bacterial membrane Certainty=0.1829 (Affirmative) < suco 

bacterial outside Certainty=0.0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAC03520 GB:AJ276410 BlpI protein [Streptococcus pneumoniae] 
Identities = 35/56 (62%) , Positives = 44/56 (78%) 

Query: 1 MDNFLELQFEELVNISGGKGNIGSAIGGCLGGMLIAAAGGPITGGAAAFVCVASGI 56 

M+ F + EEL +SGG+GN+GSAIGGC+G +L+AAA GPITGGAA +CV SGI 
Sbjct: 6 MEQFSVMDNEELEIVSGGRGNLGSAIGGCIGRVIiLAAATGPITGGARTLICVGSGI 61 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2490 

A DNA sequence (GASx482) was identified in S.pyogenes <SEQ ID 7479> which encodes the amino acid 
sequence <SEQ ID 7480>. Analysis of this protein sequence reveals the following: 

Possible site: 14 

>>> Seems to have an uncleavable N-term signal seq 

INTEGRAL Likelihood = -0.43 Transmembrane 61 - 77 ( 51 - 79) 

Final Results 

bacterial membrane Certainty=0 . 1171 (Affirmative) < suco 

bacterial outside Certainty=0 . 0000 (Not Clear) < suco 

bacterial cytoplasm Certainty=0. 0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAC03524 GB:AJ276410 BlpM protein [Streptococcus pneumoniae] 
Identities = 22/52 (42%) , Positives = 30/52 (57%) 



Query: 29 MEIKK1.ETFHQMTIEKIJUCVEGGKNNWQANVSGVIAAGSAGAAIGFPVCGVA 80 

M+ K +E FH+M I L+ +EGGKKNWQ NV A G +G +C + 

Sbjct: 1 MDTKIMEQFHEMDITMLSSIEGGKNNWQTNVLEGGGAAFGGWGLGTAICAAS 52 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2491 

A DNA sequence (GASx483) was identified in S.pyogenes <SEQ ID 7481> which encodes the amino acid 
sequence <SEQ ID 7482>. Analysis of this protein sequence reveals the following: 

Possible site: 58 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm Certainty=0. 1832 (Affirmative) < suco 

bacterial membrane Certainty=0 . 0000 (Not Clear) < suco 

bacterial outside Certainty=0. 0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2492 

A DNA sequence (GASx484) was identified in S.pyogenes <SEQ ID 7483> which encodes the amino acid 
sequence <SEQ ID 7484>. Analysis of this protein sequence reveals the following: 

Possible site: 21 

>>> Seems to have a cleavable N-term signal seq. 

Final Results 

bacterial outside Certainty=0.3000 (Affirmative) < suco 

bacterial membrane Certainty=0.0000 (Not Clear) < suco 

bacterial cytoplasm — Certainty=0.0000 (Not Clear) < suco 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be usefial 
35 antigens for vaccines or diagnostics. 

Example 2493 

A DNA sequence (GASx485) was identified in S.pyogenes <SEQ ID 7485> which encodes the amino acid 
sequence <SEQ ID 7486>. Analysis of this protein sequence reveals the following: 

Possible site: 32 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1037 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 



WO 02/34771 PCT/GB01/04789



The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2494 

A DNA sequence (GASx487) was identified in S.pyogenes <SEQ ID 7487> which encodes the amino acid
sequence <SEQ ID 7488>. Analysis of this protein sequence reveals the following: 

Possible site: 50 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1086 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2495

A DNA sequence (GASx488) was identified in S.pyogenes <SEQ ID 7489> which encodes the amino acid 
sequence <SEQ ID 7490>. Analysis of this protein sequence reveals the following: 

Possible site: 22 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2176 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2496 

A DNA sequence (GASx489R) was identified in S.pyogenes <SEQ ID 7491> which encodes the amino acid 
sequence <SEQ ID 7492>. Analysis of this protein sequence reveals the following: 

Possible site: 22 

>>> Seems to have an uncleavable N-term signal seq

Final Results

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2497

A DNA sequence (GASx490) was identified in S.pyogenes <SEQ ID 7493> which encodes the amino acid 
sequence <SEQ ID 7494>. Analysis of this protein sequence reveals the following: 

Possible site: 24 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2547 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2498 

A DNA sequence (GASx491R) was identified in S.pyogenes <SEQ ID 7495> which encodes the amino acid 
sequence <SEQ ID 7496>. Analysis of this protein sequence reveals the following: 

Possible site: 22

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood =-10.24 Transmembrane 6 - 22 ( 3 - 28)

Final Results

bacterial membrane --- Certainty=0.5097 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
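
Each "INTEGRAL" line gives a likelihood score, the predicted transmembrane segment, and a wider flanking range in parentheses. The sketch below shows one way such a line might be read; it is an illustration under the assumption that the line follows the layout shown in this listing, and the names INTEGRAL_LINE and parse_integral are hypothetical.

```python
import re

# Hypothetical pattern for lines such as
# "INTEGRAL Likelihood =-10.24 Transmembrane 6 - 22 ( 3 - 28)"
INTEGRAL_LINE = re.compile(
    r"INTEGRAL\s+Likelihood\s*=\s*(-?\d+(?:\.\d+)?)\s+Transmembrane\s+"
    r"(\d+)\s*-\s*(\d+)\s*\(\s*(\d+)\s*-\s*(\d+)\s*\)"
)


def parse_integral(line):
    """Return the score and residue ranges of one INTEGRAL line, or None if it does not match."""
    m = INTEGRAL_LINE.search(line)
    if m is None:
        return None
    start, end = int(m.group(2)), int(m.group(3))
    return {
        "likelihood": float(m.group(1)),
        "segment": (start, end),
        "flank": (int(m.group(4)), int(m.group(5))),
        "segment_length": end - start + 1,
    }


print(parse_integral("INTEGRAL Likelihood =-10.24 Transmembrane 6 - 22 ( 3 - 28)"))
```

A line that has been split or damaged in transcription returns None, so unparsed lines can be collected and inspected by hand.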

Example 2499 

A DNA sequence (GASx492) was identified in S.pyogenes <SEQ ID 7497> which encodes the amino acid 
sequence <SEQ ID 7498>. Analysis of this protein sequence reveals the following:

Possible site: 27 

>>> Seems to have an uncleavable N-term signal seq 

Final Results

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2500 

A DNA sequence (GASx493) was identified in S.pyogenes <SEQ ID 7499> which encodes the amino acid 
sequence <SEQ ID 7500>. Analysis of this protein sequence reveals the following: 

Possible site: 19 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -0.69 Transmembrane 21 - 37 ( 21 - 37)

Final Results

bacterial membrane --- Certainty=0.1277 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2501

A DNA sequence (GASx495R) was identified in S.pyogenes <SEQ ID 7501> which encodes the amino acid
sequence <SEQ ID 7502>. Analysis of this protein sequence reveals the following:

Possible site: 28

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2891 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2502

A DNA sequence (GASx499R) was identified in S.pyogenes <SEQ ID 7503> which encodes the amino acid
sequence <SEQ ID 7504>. Analysis of this protein sequence reveals the following:

Possible site: 15

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -2.50 Transmembrane 3 - 19 ( 1 - 20)

Final Results

bacterial membrane --- Certainty=0.1999 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2503 

A DNA sequence (GASx500) was identified in S.pyogenes <SEQ ID 7505> which encodes the amino acid 
sequence <SEQ ID 7506>. Analysis of this protein sequence reveals the following: 

Possible site: 54

>>> Seems to have an uncleavable N-term signal seq

Final Results

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAC77220 GB:AE000497 orf, hypothetical protein [Escherichia coli]
Identities = 262/480 (54%), Positives = 338/480 (69%), Gaps = 5/480 (1%)

Query: 18 GMUJRHGrilAGATGTGKTVTLKVEAEQLSIiRGVFWLRDIKGDLSNLTK^ 77

GM HRH6LI GATGT6KTVTL+ LAE LS GVPVF+AD+KiGDL+ + +AG V++KL AR 
Sbjct: 20 GMRHRHGLITGATGTGKaWI^KLAESLSEIGVPVFMaDVKGDLTGVAQAGTVSEK^ 79 

Query: 78 lATIGVSDYQPQAFPVRMWDVFGQNGQPLRTTISEI/SPMMLSRLLNLNDTQTGVIiNIVFK 137 
L IGV+D+QP A PV +WD+FG+ G P+R T+S+LGP++L+RLLNLND Q+GVIjNI+F+

Sbjct: 80 LKNIGVNDWQPHftNPVVVWDIPGEKGHPVRATOSDLGPLLLARLIJJJIjroVQSGV™ 139 

Query: 138 lADEKXSWLLIDLKDLQAILKEVGDHASDYSSHyGNIAKQSIGAIQRSLLTLEQEQaHQPF 197 
IAD++G LL+D KDL+AI + +GD+A + + YGNU- S+GAIQR LL+LEQ+GA PF 
Sbjct: 140 lADDQGLLLIiDFKDLRAITQYIGnNAKSFQNQYGNISSASVGAlQRGLLSLEQQGAAHPF 199

Query: 198 GEPALDVADLMQLDVASGYGAINILSATKLFQSPTLYTTFLLWLLSELYKLLPEVGDLDK 257 

GEP IiD+ D M+ D A+G G INILSA KL+Q P LY LLW+LSELY+ LPE GDL+K 
Sbjct: 200 GEP^alDIKDWMRTO-ANGKGVINILSAEKLY(^PKLYAASLLW^aJSELYEQLP^ 258 


Query: 258 PKMVFFFDEAHLLFKDAPKVFLEKVEQIVRLIRSKGVGIPFVTQNPLDLPETVLAQLCasnR 317 

PK+VFFFDEAHLLF DAP+V L+K+EQ++RLIRSKGVG++FV+QNP D+P+ VL QLGNR 
Sbjct: 259 PKLVFFFDEAHLLFNDAPQVLLDKIEQVIRLIRSKGVGVWFVSQNPSDIPDNVLGQLGNR 318 

Query: 318 IQHAFRAYTPKEQKAVRVAADTFRQNPDLDVARVITELEVGEALISVLNDKGQPSIVERA 377

+QHA RA+TPK+QKAV+ AA T R NP D + I EL GEALIS L+ KG PS+VERA 
Sbjct: 319 VQHALRAFTPKDQKAVKAAAQTMRANPAFDTEKAIQELGTGEALISFLDAKGSPSWERA 378 

Query: 378 YIMPPKSSFAVLSEIESQQLVQSSPFASKYSQSIDRESAYEKLAAKVLEDNRLAQEAIAT 437 
++ P S ++E E L+ SP KY +DRESAYE L K + + Q

Sbjct: 379 MVIAPCSRMGPVTEDERNGLINHSPVYGKYEDEVDRESAYEML-QKGFQASTEQQNNPPA 437 



Query: 438 AQREKEAKEAIKAQAATKKANRRSVGRSHRTVVEKATDAFISTTVRTIGRELVRGLLGSIi 497 
+E+I K+ +R+ ++VRG+LGSL

Sbjct: 438 KGKEVAVDDGIIlGGLKDILFGTTGPRGGKK---DGVVQTMAKSaaRQ^m^QITO 494 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2504 

A DNA sequence (GASx502) was identified in S.pyogenes <SEQ ID 7507> which encodes the amino acid 
sequence <SEQ ID 7508>. Analysis of this protein sequence reveals the following: 

Possible site: 49 

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood =-13.59 Transmembrane 59 - 75 ( 52 - 77)
INTEGRAL Likelihood = -9.34 Transmembrane 4 - 20 ( 1 - 24)

Final Results

bacterial membrane --- Certainty=0.6434 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:CAB15368 GB:Z99121 yvaL [Bacillus subtilis]
Identities = 28/72 (38%), Positives = 44/72 (60%), Gaps = 2/72 (2%)

Query: 1 MYNLLLTILLVLSGLLEIAIFMQPQKNPSSNVFDSSGSEALFERTKARGFEAFMQRFTAV 60

M+ +L+T+L+++S L I + +Q K+ + S G+E LF + KARG + + R T V 
Sbjct: 1 MHAVLITLLVIVSIALIIWLLQSSKSAGLSGAISGGAEQLFGEQKARGLDLILHRITW 60 

Query: 61 L--VFFWLAIAL 70 
L +FF L IAL

Sbjct: 61 LAVLFFVLTIAL 72 



Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
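
The start and end coordinates printed on each alignment row allow a simple consistency check: the number of residues on the row, not counting gap characters, should equal end minus start plus one. The sketch below is an illustration only, with the hypothetical names ROW and check_row; it can help flag rows whose residues were lost or altered in transcription, but it cannot say which characters are wrong.

```python
import re

# Hypothetical pattern for rows such as "Query: 61 L--VFFWLAIAL 70"
ROW = re.compile(r"(Query|Sbjct):\s*(\d+)\s+(\S+)\s+(\d+)")


def check_row(line):
    """Return True if the residue count matches the coordinates, False if not,
    or None if the line is not a well-formed alignment row."""
    m = ROW.fullmatch(line.strip())
    if m is None:
        return None
    start, seq, end = int(m.group(2)), m.group(3), int(m.group(4))
    residues = sum(1 for ch in seq if ch != "-")  # gap characters are not residues
    return residues == end - start + 1


print(check_row("Query: 61 L--VFFWLAIAL 70"))
```

A row that returns False has a residue count that disagrees with its printed coordinates and is a candidate for checking against the deposited sequence.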

Example 2505

A DNA sequence (GASx505) was identified in S.pyogenes <SEQ ID 7509> which encodes the amino acid 
sequence <SEQ ID 7510>. Analysis of this protein sequence reveals the following: 

Possible site: 45 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -1.44 Transmembrane 140 - 156 ( 138 - 156)

Final Results

bacterial membrane --- Certainty=0.1574 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAF09704 GB:AE001874 glutamine cyclotransferase [Deinococcus radiodurans]
Identities = 81/229 (35%), Positives = 128/229 (55%), Gaps = 10/229 (4%)

Query: 16 YSYDSNLYTQGLEQLNNNHILLSAGRYGFSKVGVYDL--TQEIFSEKIAFP-DTVFAEGL 72 
Y +D +TQGL+ L HLSG+GS + V+L + ++S +A F EG
Sbjct: 54 YPHDRAAFTQGLQYLGGGHYLESTGQVGESDLRVSELRGAKVLWSTPIiRQftLPQAFGEGS 113

Query: 73 TWEDYFWLLTYKEGVAYKFDKATCNCLGAYPFEGDGWGLAYDKENQCLWMTSGNAFLQK 132
T + + LT+++GVA +D T G + ++G+GWGL D ++ L M++G + L
Sbjct: 114 TQLGSTVYQLTWQDGVALTYDARTFKETGRHRYQGEGWGLTSDGKS--LIMSNGTSTLVW 171

Query: 133 RDPKDFALLDTVLmiESVPISICJSIELEYVTOYLYANIWQTNTIVKLQPDSGKVVa^ 192
RDPK FA +V V + P+ LNELEYV G +YAN+W T+ I ++ P +GKV+ D+
Sbjct: 172 RDPKTFAAQRSVQOTDQGQPVRISnaraiiEYVQGSTOANVWLTDRiaRIHPQTGKVLTWIDV 231

Query: 193 SPLLKALNLDKSHYPDL NVMGIAHLDQQ-RFLITGKLYPLMLEV 236
S L + ++ + +V NGIA + ++ L+TGK +P + EV
Sbjct: 232 SDLTREVSAAATKQGQALTFDDVPNGIAFIPERGTLIOiTGKRWPTLFEV 280

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.



Example 2506 

A DNA sequence (GASx506R) was identified in S.pyogenes <SEQ ID 7511> which encodes the amino acid 
sequence <SEQ ID 7512>. Analysis of this protein sequence reveals the following: 

Possible site: 33 

>» Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2800 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2507 

A DNA sequence (GASx507R) was identified in S.pyogenes <SEQ ID 7513> which encodes the amino acid 
sequence <SEQ ID 7514>. Analysis of this protein sequence reveals the following: 

Possible site: 53 

>>> Seems to have a cleavable N-term signal seq.

INTEGRAL Likelihood =-10.51 Transmembrane 103 - 119 ( 97 - 124)
INTEGRAL Likelihood = -9.13 Transmembrane 126 - 142 ( 122 - 145)
INTEGRAL Likelihood = -8.65 Transmembrane 290 - 306 ( 286 - 307)
INTEGRAL Likelihood = -7.17 Transmembrane 200 - 216 ( 198 - 228)
INTEGRAL Likelihood = -7.06 Transmembrane 58 - 74 ( 54 - 82)
INTEGRAL Likelihood = -3.19 Transmembrane 223 - 239 ( 220 - 242)
INTEGRAL Likelihood = -2.81 Transmembrane 244 - 260 ( 244 - 261)
INTEGRAL Likelihood = -2.71 Transmembrane 174 - 190 ( 169 - 191)

Final Results

bacterial membrane --- Certainty=0.5203 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 
>GP:CAB56669 GB:AL121596 putative membrane protein [Streptomyces coelicolor A3(2)]
Identities = 119/322 (36%), Positives = 182/322 (55%), Gaps = 24/322 (7%)

Query: 9 LETIYILIGLQLFHTAYCTFKDKTNPVyFGTALFWGLLGVTFV GGAFL 56

+E +Y LIGL A D++NP + +A FWGLLG+TF GG L 

Sbjct: 4 VBWLYWLIGLVFVVmVQMAICIRSNPKRWTSAAFWGLLGLTFPYGTGVaNATAGNGGWXL 63 

Query: 57 PNKVIGFIVIVLALLTLFKQVRIGTLPAFNEQKAEESAHRIGNWIFLPVMLMAMISLLLA 116 
P + +G V+ L +L F + G ++ E +A R+GN IF+P + + +++++ A

Sbjct: 64 PAEPLGVAVIALIVIiBGFNFIjGKXSVFVTrTGEQREARARKljGN^ 123 

Query: 117 LILPDFSKSAIGIAGILA TIAILIITKjQKPSALLAENNRMNQQVSTSGILP 167 

+11 + G A +L + +L+ ++K S + M + + ++ +LP 

Sbjct: 124 SVLDESGLFETGKATLLGLGLGCVAALVWMLVTGEKKLSVPIHSGRSMLEAMGSALLLP 183

Query: 168 QLLGALGAIFAAAGVGDVIASLIREIVPADSRFFGVLAYVLGMVIFTMIMGNAFAAFTVI 227 

QLL LG+IFAAAGVGD + ++ +++P DS++F VLAY +GM +FT+IMGNAFAAF V+ 
Sbjct: 184 QLLAVLGSIFAAAGVGDQVGDIMNKVLPDDSKYFAVLAYCVGMFLFTVIMGNAFAAFPVM 243 


Query: 228 TTGIGVPFVFAL--GaDPIIAGALiAMTAGFCGTLLTPMAflNENALPVaLMEIKDRNAVIK 285 

T IG P + G +P + A+ M AGF GTL TPMAANFN +P L+E+KD+ IK 

Sbjct: 244 TAAIGWPVLIQQMHGISIEPAVL-AIGMLAGFAGTLCTPMAANENIVPATLLELKDQYGPIK 302 

Query: 286 KQAPIALVLIISHIALMYLLAY 307

Q P + L+ +M L A+ 

Sbjct: 303 AQLPTGIALLGCCTVIMALFAF 324 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics.

Example 2508 

A DNA sequence (GASx508R) was identified in S.pyogenes <SEQ ID 7515> which encodes the amino acid 
sequence <SEQ ID 7516>. Analysis of this protein sequence reveals the following: 

Possible site: 61

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood =-12.15 Transmembrane 212 - 228 ( 208 - 235)
INTEGRAL Likelihood = -8.81 Transmembrane 23 - 39 ( 17 - 64)
INTEGRAL Likelihood = -7.43 Transmembrane 45 - 61 ( 40 - 64)
INTEGRAL Likelihood = -1.49 Transmembrane 114 - 130 ( 114 - 130)
INTEGRAL Likelihood = -1.49 Transmembrane 3 - 19 ( 3 - 20)
INTEGRAL Likelihood = -1.49 Transmembrane 76 - 92 ( 76 - 92)

Final Results

bacterial membrane --- Certainty=0.5861 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:CAB56670 GB:AL121596 possible integral membrane protein [Streptomyces coelicolor A3(2)]
Identities = 77/220 (35%), Positives = 138/220 (62%), Gaps = 2/220 (0%)

Query: 23 IKLIGIVIIVLGFILKCDAIATVWAGLVTALVSGISFIDFLDILGKEFTNQRLLTIFFI 82
I L+G+V+++LGF+ + + + V VAG+VT Ii+ ++ ++ L G+ F + R +T++ I
Sbjct: 2 IVLLGVVVVILGFVTRRNPVLVVGVAGIVTGLLGKMNPLEVLaAFGRSFADSRSVTVYAI 61

Query: 83 TLPLIGLSETYGLKHRATQLIQRVQALTVGRLLTLYLIIRELAGLFSIR-LGGHPQFVRP 141 

LP+IGL E YGL+ +A LI R+ L+ GR LT+YL++R++ F + +GG Q VRP 
Sbjct: 62 VLPVIGLLERYGLREQARHLIGRLGKLSAGRFLTVYLLVRQVTAAFGLNSIGGPAQTVRP 121 
Query: 142 LIQPMGEaaAKaNIGEELTDaEKDDIKBMAAftNENFGNFPAQNTFV^ 201

L+ PM EAaA+ + G +L D ++ +++ +A+ + G FF ++ F+ G +LLI G + 
Sbjct: 122 LVAPMaEaUiftERSTGMIiPDKLREKWRSYSASADTVGVFFGEDCPIAIGSILLITGFVNS 181 


Query: 202 LGY-DGNQAKIAFSSILIAIISIIIVAIYNYLFEKKMERQ 240 

+ D ++A +1 +A+ + +1 L +K++ER+ 

Sbjct: 182 TYHQDIEPTQLALWAIPLAVCAFLIHGRRLLLMDKQLERE 221 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2509 

A DNA sequence (GASx520) was identified in S.pyogenes <SEQ ID 7517> which encodes the amino acid 
sequence <SEQ ID 7518>. Analysis of this protein sequence reveals the following:

Possible site: 13

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2652 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2510 

A DNA sequence (GASx522R) was identified in S.pyogenes <SEQ ID 7519> which encodes the amino acid 
sequence <SEQ ID 7520>. Analysis of this protein sequence reveals the following:

Possible site: 21 

>>> Seems to have an uncleavable N-term signal seq

Final Results

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2511 

A DNA sequence (GASx523) was identified in S.pyogenes <SEQ ID 7521> which encodes the amino acid
sequence <SEQ ID 7522>. Analysis of this protein sequence reveals the following: 

Possible site: 22 
>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2133 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2512 

A DNA sequence (GASx525) was identified in S.pyogenes <SEQ ID 7523> which encodes the amino acid 
sequence <SEQ ID 7524>. Analysis of this protein sequence reveals the following: 

Possible site: 14

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2364 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2513 

A DNA sequence (GASx535) was identified in S.pyogenes <SEQ ID 7525> which encodes the amino acid 
sequence <SEQ ID 7526>. Analysis of this protein sequence reveals the following:

Possible site: 47 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.4223 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2514 

A DNA sequence (GASx536) was identified in S.pyogenes <SEQ ID 7527> which encodes the amino acid
sequence <SEQ ID 7528>. Analysis of this protein sequence reveals the following: 
Possible site: 59

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1102 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAB85515 GB:AE000874 conserved protein [Methanobacterium thermoautotrophicum]
Identities = 82/236 (34%), Positives = 132/236 (55%), Gaps = 11/236 (4%)

Query: 9 MNLSIPGLKNIPYLKEGDSIEKLIEESIKTSEFFIEDNDVLCIASKWSIAEGQVMSLNE 68
M +S+ G++ +P + GD I LI ++ , + D D++ lA +VS AEG ++SL E
Sbjct: 1 MGISLIGVEGMPLVGAGDDIAXLIISAIiNEGGEDIiLDGDIIVIAETIVSKAEGNIISIjEE 60

Query: 69 IQVSDVAKEIHRNIPRKDPRIIEIMMLVNRDLSRLDIKKNYIGCRLEMGLKLTSGGIDR 128
1+ S A +1 KDP ++E +L + + ++I +G + GID
Sbjct: 61 IKPSPEALDIAERTG-KDPSLVEAILG---ESSEIIRVGHDFIVSETRHGFVCANAGIDE 116

Query: 129 KSVDEVFL--LENNPnASAKRISEYLKKSLGKNVAWITDSDGREDKRGATQVAIGIYGI 186
+VD+ LP +PD SA++1 L+++ G+ +AV+1+D+ 6R + GA VA+G+ G+
Sbjct: 117 SNVDDGLATPLPRDPDGSAEKILRTLQEATGRELAVIISDTQGRPFREGAVGVAVGVAGL 176

Query: 187 HPL--RKTEVIDSQGETIKFQEETLCDMIAACAGLVMGQRGTGIPAVLIRGLDYKW 240
P+ RK E D G +++ + D +AA A LVMGQ G+PAV+IRG Y W
Sbjct: 177 SPIWDRKGE-RDLYGRSLETTRVAVADELRAAASLVMGQaDEGVPAVIIRG--YPW 229

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.



Example 2515 

A DNA sequence (GASx537) was identified in S.pyogenes <SEQ ID 7529> which encodes the amino acid
sequence <SEQ ID 7530>. Analysis of this protein sequence reveals the following: 

Possible site: 50 

>>> Seems to have no N-terminal signal sequence 
INTEGRAL Likelihood = -1.12 Transmembrane 174 - 190 ( 174 - 190)

Final Results

bacterial membrane --- Certainty=0.1447 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics.

Example 2516 

A DNA sequence (GASx538) was identified in S.pyogenes <SEQ ID 7531> which encodes the amino acid
sequence <SEQ ID 7532>. Analysis of this protein sequence reveals the following: 
Possible site: 32

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3852 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAB99212 GB:U67562 conserved hypothetical protein [Methanococcus jannaschii]
Identities = 129/387 (33%), Positives = 208/387 (53%), Gaps = 44/387 (11%)



Query: 18 EWERKGLGHPDTLADGIAEQIEIDYSLYCLDKFGVIPHHNFDKlllRGGHSVQDFGGSD 77
E+VERKGI1GHPD++ DGIAE + ++KFG I HHN D++ + GGH+ FGG
Sbjct: 20 EIVERKBLGHPDSICIXSIAESVSRMjCKMYMEKFGTILHHNTDQVELVGGHAYPKFGGGV 79

Query: 78 FIEPIKIIFLGRASKKCFNS SIPLFKIQKKAATKYLNRILPNLDVENYVEFETL 131
+ PI 1+ GRA+ + + +P4- KAA +YL ++L N+DV+ V +
Sbjct: 80 MVSPIYILLSGRATMEILDKEKNEVIKLPVGTTAVKAAKEYLKKVLRNVDVDKDVIID-- 137

Query: 132 TSDFTTKTNWFSPEAIEDLP-EYLDVPKANDTATMISYWPLTISEEIiALMIEGYFYKIiD- 189
+ S + ++ + +VP A1IDT+ + Y PL+ +E L L E + +
Sbjct: 138 CRIGQGS^roLVDVFERQKNEVPLANDTSFGVGYAPLSTTERLVLETERFLNSDEL 192

Query: 190 KNELPTPRFTKMGGDIKVMVVRNDLEYSIRINFPLISKFFNNDIESQLYVDKHVEKIKKY 249
KNE+P +G DIKVM +R + ++ I ++ ++ N IE V +EK++K
Sbjct: 193 KNEIPA VGEDIKUMGLREGKKITLTIAMAWDRYVKN- lEEYKEV- - - lEKVRKK 243

Query: 250 lEQKYKNIS-- FSIDYH - - -YYLTTTGSCIDFGEEGAVGRGNKrHGIISSFR 296
+E K 1+ + ++ H YLT TG+ + G++G+VGRGN+ +6+1+ FR
Sbjct: 244 VEDIJUCKIADGYEVEIHIIWADDYERESVYLTVTGTSAEMGDDGSVGRGNRVNGLITPFR 303

Query: 297 PNTMEAPAGKNCTYFVGKVWGFLSDTIAKEIYEAFNT- PCQI IMQLNIGSKLYRPTHLFI 355
P +MEA +GKN VGK++ L++ lA +1 + C++ IG + P L I
Sbjct: 304 PMSMEaASGKNPVNHVGKIYNILANLIANDIAKLEGVKECYVRILSQIGKPINEPKALDI 363

Query: 356 Q--TEESVD QERVLEIVNRHIiNN 376
+ TE+S D + + EI N+ L+N
Sbjct: 364 EIITEDSYDIKDIEPKAKEIANKWLDN 390





Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2517 

A DNA sequence (GASx539) was identified in S.pyogenes <SEQ ID 7533> which encodes the amino acid 
sequence <SEQ ID 7534>. Analysis of this protein sequence reveals the following: 

Possible site: 17 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1436 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2518 

A DNA sequence (GASx540) was identified in S.pyogenes <SEQ ID 7535> which encodes the amino acid 
sequence <SEQ ID 7536>. Analysis of this protein sequence reveals the following: 

Possible site: 45 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3956 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAD36304 GB:AE001779 conserved hypothetical protein [Thermotoga maritima]
Identities = 105/353 (29%), Positives = 173/353 (48%), Gaps = 46/353 (13%)

Query: 3 VIGIPTLNEADNISRLVKQIDEYAVNL-GKEIIIINSDSKSTDGTPQIFLETKTYISIT-KV 60

V+GIP+ N A+ IS + + + V+ + +I+NSD S DGT + F+ET T+ K 
Sbjct: 106 VVGIPSVNNAETISHVaRTAAQGIVDFFDGDGMIVNSDGGSADGTRERFMETDTTOLPKE 165 

Query: 61 SIVSEA-KGKGYl)maJIFEYAINHVENFSGLILIDGDVVSMKKMWLEKMFIAlESGN-DL 118 
S V E GKG +R I E+A+ + ++ +D D+ S+K W+E++ + G D

Sbjct: 166 SFVYEGLPGKGSAMRAIMEPALKQ--DAEAVVPLnaDLRSVKPWWVERLAGPVLKGEM)Y 223 

Query: 119 IIPNYARKSFEGNATNHFIYPMLVKIFKRDMPYQCISGDPGFSRGLIKDLTLKCN- -WHK 176 
+ P Y R F+G TN+ +PM ++ + + Q I GDFG R L++ K W+ 
Sbjct: 224 VTPFYLRHRFDGTITNNVCPPMTAVLYGKKVR-QPIGGDFGVGRKLLEIYLGKPKEIWNT 282

Query: 177 YTrfiyGlDIFLTI.TAILKSYKIKEIDLQSKIH--KKSFEKIEKIFLEVSQSFPETINDNS 234 
+G1DI++T TAI +S ++ + L +K+H K + ++ +PL+V + FE + ■ 

Sbjct: 283 DVARFGIDIWMTTTAINESGRVVQAALGTKVHDVKDPGKHLKGMPLQVVGTLPELV 338 


Query: 235 LNQDKLRUSriNPESHSRQFIKSSDI -LSSNDIENLKLRALFLLQEEKQY 282 

I +E+ ++ K D+ S DI NLK A L+E + 

Sbjct: 339 ITYENVWKEIWKIEDVPIYGETPQEEVPSMSIDICmiKKIiARETLEEVEYI 389 

Query: 283 LHG-LSEVEWDGI--LSNTINNIYRYSSEEHSL YLLPLYLLRVYNY 325

G LSEV+ G LS+ ++ +YR + + LLP Y R + 

Sbjct: 390 DRGILSEVKESGTLSLSSWVDTLYRSAVQYRKTRDKKVVENLLPFYFARTARF 442

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics.

Example 2519 

A DNA sequence (GASx542) was identified in S.pyogenes <SEQ ID 7537> which encodes the amino acid 
sequence <SEQ ID 7538>. Analysis of this protein sequence reveals the following: 



Possible site: 20

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -5.31 Transmembrane 3 - 19 ( 1 - 21)

Final Results

bacterial membrane --- Certainty=0.3123 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>



No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:BAB07427 GB:AP001519 nucleotide sugar dehydrogenase [Bacillus halodurans] 
Identities = 184/388 (47%), Positives = 274/388 (70%), Gaps = 3/388 (0%)

Query: 1 MKITWGIGWGLSIGi:iIiIAKEHDOTFFDIDNKiaDIiINKRQSPliKEftftINKU:.C-KA^ 59 

M IT+ G GYVGLS +LLA+ +07 +DI +K+D+IN R+SP+ + I + L K N 
Sbjct: 1 MNITIAGTGYVGLSNAVLLAQHNDVIAYDIVQEKVDMINNRKSPIVDREIEEF^ 60 

Query: 60 INATSSEELAYKDATFIILSLPTNL--KFNKLDTSIIEISVSNILKINKKATIVIKSTVP 117 

+ AT+ +E A+KDA F+++S PIN + N DTS +E +S++L IN A +VIKST+P 
Sbjct: 61 LTATTDKEKAFKDAQFWISTPTlra)PEK^IYFDTSSVEAVISDVLSINE^IAVMVII^ 120 

Query: 118 IGFTEYLRNRFHYNDIIFSPEFLREGSTIHDQIiyPSRTIVGNESRNSQLFIiDILTDISVE 177 

+G+T + RF+ +IIFSPEFLREGS ++D L+PSR +VG ++ +++F +L +++ 
Sbjct: 121 VGYTREVNERFNTKNIIFSPEFLREGSALYDNLHPSRIWGERTQRAKIFAALLVQGAIK 180 

Query: 178 KDSPSLLVGSSEAEAIKLFSNAYLAQKIAFFNELDTFAEMQNLDSKKIIEAMGYDQRIGN 237 

++ L S+EAEAIKLF+N YLA ++AFFNELD++AE++ LD+K+II+ +G D RIG 
Sbjct: 181 ENIDVLFTDSTEAEAIKLFAOTTLftMRVAFENELDSYAELKGIjDARQIIDGVGLDPRIGT 240 

Query: 238 SHNNPSFGFGGYCLPKDIKQLEYHFKEIPAPIITSISESNLLRKIHIAKMIUlISSAKTIG 297 

+NNPSFG+G6YCLPKD KQL +F+++P II +1 ++N RK H+A MIL K +6 
Sbjct: 241 HYOTPSFGYGGYCLPKDTKQLIANFEDVPlTOTIIGAIVDaMDTRKDHVAim 300 

Query: 298 lYRINSKKDSDNCRESSTIDVAKLLKSSGKDVIIFEPLINQKKFLGCPLSNDFNEFIKYS 357 

IYR+ K SDN R+S+ +DV L ++G +V+++EP ++ +F G + DF EF K S 
Sbjct: 301 lYRLTMKTCSDNFRQSAIIDVMTRIJQNAGMVVVYEPALnATEFTC^ 360

Query: 358 DIIVaNRIDDALRKCNSKVFTRDIFQYD 385 

D+IVANRH- D L++ KV+TRD++ D 
Sbjct: 361 DVIVANRLSDOLKEVAEKVYTRDLYTRD 388 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
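
The summary line that heads each GENPEPT alignment can be read mechanically. The following is a minimal sketch, assuming each statistic is written as `Name = matched/length (percent%)`; the function name `parse_alignment_stats` is hypothetical and is not part of any alignment tool. The sample line is the one reported for Example 2519.

```python
import re

# Matches e.g. "Identities = 184/388 (47%)"; tolerant of stray spaces.
STAT_RE = re.compile(
    r"(Identities|Positives|Gaps)\s*=\s*(\d+)\s*/\s*(\d+)\s*\((\d+)\s*%\)"
)


def parse_alignment_stats(line):
    """Return {name: (matched, aligned_length, reported_percent)}."""
    return {
        name: (int(matched), int(length), int(percent))
        for name, matched, length, percent in STAT_RE.findall(line)
    }


stats = parse_alignment_stats(
    "Identities = 184/388 (47%), Positives = 274/388 (70%), Gaps = 3/388 (0%)"
)
for name, (matched, length, percent) in stats.items():
    # Show the raw fraction next to the percentage printed in the report.
    print(f"{name}: {matched}/{length} = {matched / length:.3f} (reported {percent}%)")
```

Keeping the raw counts alongside the reported percentage makes it possible to compare hits on a common footing without relying on the printed, whole-number percentage alone.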

Example 2520 

A DNA sequence (GASx544R) was identified in S.pyogenes <SEQ ID 7539> which encodes the amino acid 
sequence <SEQ ID 7540>. Analysis of this protein sequence reveals the following: 

Possible site: 34

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -0.06 Transmembrane 46 - 62 ( 46 - 62)

Final Results

bacterial membrane --- Certainty=0.1022 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics.

Example 2521

A DNA sequence (GASx545R) was identified in S.pyogenes <SEQ ID 7541> which encodes the amino acid 
sequence <SEQ ID 7542>. Analysis of this protein sequence reveals the following: 

Possible site: 58 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -1.49 Transmembrane 186 - 202 ( 186 - 203)

Final Results

bacterial membrane --- Certainty=0.1595 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
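
The "Final Results" block of each localization report lists one certainty value per compartment. As an illustration only, the sketch below reads those rows and selects the compartment with the highest certainty. Treating the highest value as the predicted location is an assumption made for this sketch, and the names `parse_final_results` and `predicted_location` are hypothetical. The sample rows are those of Example 2521.

```python
import re

# One localization row, e.g.
# "bacterial membrane --- Certainty=0.1595 (Affirmative) < succ>"
# \W* skips the spaces and dashes between the compartment and "Certainty".
ROW_RE = re.compile(
    r"bacterial\s+(\w+)\W*Certainty\s*=\s*(\d\.\d+)\s*\(([^)]*)\)"
)


def parse_final_results(text):
    """Return a list of (compartment, certainty, verdict) tuples."""
    return [
        (site, float(certainty), verdict.strip())
        for site, certainty, verdict in ROW_RE.findall(text)
    ]


def predicted_location(rows):
    """Assumption: the row with the highest certainty is the prediction."""
    return max(rows, key=lambda row: row[1])


report = """\
bacterial membrane --- Certainty=0.1595 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>
"""

rows = parse_final_results(report)
print(predicted_location(rows))
```

When every row reports 0.0000, as in some examples, `max` simply returns the first row, so a caller may prefer to check the verdict field before relying on the result.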

Example 2522 

A DNA sequence (GASx546R) was identified in S.pyogenes <SEQ ID 7543> which encodes the amino acid 
sequence <SEQ ID 7544>. Analysis of this protein sequence reveals the following: 

Possible site: 47 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2422 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
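
Each example opens with the same sentence naming the sequence, the organism, and the two sequence identifiers. A minimal sketch of extracting those fields is given below, assuming the sentence always follows this wording; the `SequenceRecord` class and the `parse_header` function are hypothetical names introduced for illustration. The sample text is the opening sentence of Example 2522.

```python
import re
from dataclasses import dataclass


@dataclass
class SequenceRecord:
    locus: str           # e.g. "GASx546R"
    organism: str        # e.g. "S.pyogenes"
    dna_seq_id: int      # SEQ ID of the DNA sequence
    protein_seq_id: int  # SEQ ID of the encoded amino acid sequence


HEADER_RE = re.compile(
    r"A DNA sequence \((\w+)\) was identified in (\S+) <SEQ ID (\d+)> "
    r"which encodes the amino acid sequence <SEQ ID (\d+)>"
)


def parse_header(text):
    """Parse the opening sentence of an example; return None if absent."""
    flat = " ".join(text.split())  # undo line wrapping before matching
    match = HEADER_RE.search(flat)
    if match is None:
        return None
    locus, organism, dna_id, protein_id = match.groups()
    return SequenceRecord(locus, organism, int(dna_id), int(protein_id))


record = parse_header(
    "A DNA sequence (GASx546R) was identified in S.pyogenes <SEQ ID 7543> "
    "which encodes the amino acid\nsequence <SEQ ID 7544>. Analysis of this "
    "protein sequence reveals the following:"
)
print(record)
```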

Example 2523 

A DNA sequence (GASx547R) was identified in S.pyogenes <SEQ ID 7545> which encodes the amino acid 
sequence <SEQ ID 7546>. Analysis of this protein sequence reveals the following: 

Possible site: 60 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1612 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2524 

A DNA sequence (GASx548) was identified in S.pyogenes <SEQ ID 7547> which encodes the amino acid 
sequence <SEQ ID 7548>. Analysis of this protein sequence reveals the following:

Possible site: 44 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.5156 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2525 

A DNA sequence (GASx549R) was identified in S.pyogenes <SEQ ID 7549> which encodes the amino acid 
sequence <SEQ ID 7550>. Analysis of this protein sequence reveals the following: 

Possible site: 21 

>>> Seems to have a cleavable N-term signal seq.

Final Results

bacterial outside --- Certainty=0.3000 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2526

A DNA sequence (GASx552) was identified in S.pyogenes <SEQ ID 7551> which encodes the amino acid 
sequence <SEQ ID 7552>. Analysis of this protein sequence reveals the following: 

Possible site: 15

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -0.59 Transmembrane 83 - 99 ( 83 - 99)

Final Results

bacterial membrane --- Certainty=0.1235 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2527 

A DNA sequence (GASx553) was identified in S.pyogenes <SEQ ID 7553> which encodes the amino acid 
sequence <SEQ ID 7554>. Analysis of this protein sequence reveals the following: 

Possible site: 49 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2781 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2528 

A DNA sequence (GASx554) was identified in S.pyogenes <SEQ ID 7555> which encodes the amino acid 
sequence <SEQ ID 7556>. Analysis of this protein sequence reveals the following: 

Possible site: 18 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2792 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2529 

A DNA sequence (GASx555) was identified in S.pyogenes <SEQ ID 7557> which encodes the amino acid 
sequence <SEQ ID 7558>. Analysis of this protein sequence reveals the following: 

Possible site: 35 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -0.00 Transmembrane 49 - 65 ( 49 - 65)

Final Results

bacterial membrane --- Certainty=0.1001 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database: 

>GP:BAA36631 GB:AB016282 ORF25 [bacteriophage phi-105]
Identities = 43/118 (36%), Positives = 69/118 (58%), Gaps = 2/118 (1%)

Query: 3 LLDLIGRKRARDKPQNSyEGQDFSyLFG--RTTSGENVDEFKTMQTTAVYACVRVLAEAV 60 
LL+ + KR+ +FG +T SGE V E ++ ++ACV VL++ + 

Sbjct: 2 LLERMFEKRSGSSDHEDGFliraiLiamFGGRKTASGERVSESNSLVQPDIFACVimi 61

Query: 61 ASLPIHIYERTENGKEKKLDHPLYFLLHDEPNPEMSSFIFRETIMSHLLIWGNAYVQI 118 

A LPIH Y+RT+ G E+K +H ++ PNP M++F +++ +M+H+L WGNAY I 

Sbjct: 62 AKLPIHTYKRTDGGIKRKPEHKSAHAVYARPNPYMTAFTWKKLMMTHVLTW6NAYSYI 119 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2530 

A DNA sequence (GASx556) was identified in S.pyogenes <SEQ ID 7559> which encodes the amino acid 
sequence <SEQ ID 7560>. Analysis of this protein sequence reveals the following: 

Possible site: 43

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2055 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2531 

A DNA sequence (GASx557) was identified in S.pyogenes <SEQ ID 7561> which encodes the amino acid 
sequence <SEQ ID 7562>. Analysis of this protein sequence reveals the following:

Possible site: 50 

>>> Seems to have no N-terminal signal sequence 

Final Results

bacterial cytoplasm --- Certainty=0.1696 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics.

Example 2532

A DNA sequence (GASx559) was identified in S.pyogenes <SEQ ID 7563> which encodes the amino acid 
sequence <SEQ ID 7564>. Analysis of this protein sequence reveals the following:

Possible site: 51 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1556 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAB15798 GB:Z99123 alternate gene name: ipa-83d [Bacillus subtilis]
Identities = 70/263 (26%), Positives = 121/263 (45%), Gaps = 25/263 (9%)

Query: 68 KTIEQIKELK--YSinAVACWDEALTHIADDISKELGLNPISSIiDSQSPRFKDRMRMVCE 125 
+ +KQI ++ + DA+ +E + LGL +++ R K++MR 

Sbjct: 87 EVVBQIVKVaEMFG!amiTTNNELFIAPMMACERLGLRGa6VQAftENARDKNK»^^ 146

Query: 126 AGGLKMPK:yKIINQFSDTNKIINW-KYPLIVKPTSFIASIGVK3CVYNFSELQQft.VSQMLN 184 

G+K K K + D + PLI+KPT +SIGV + + + +++ + 

Sbjct: 147 KftGVKSIKNKRVTTLEDFRAALEEIGTPLILKPTYLASSIGVTLITDTETAEDEENRVND 206 

Query: 185 VKFPVYIASGVYELGELTOLEPRVLVEEFIDGE EY-SLESWRNGIYTP 232 

+ + V E + EEF+ 6E +Y S+E ++ +G Y P 

Sbjct: 207 YLKSINVPKAV TFEAPFIAEEFLQGEYGDWYQTEGYSDYISIEGIMADGEYFP 259 

Query: 233 LGITKKIVDEKLFMDEIGHIFPSNLNKEEKSRVYSWAEKLHQILQLNHITTHTEFRIGRN 292

+ 1 K ++ E HI PS L++E K ++ A+K ++ L L + THTE ++ +N 
Sbjct: 260 IAIHDKT--PQIGPTETSHITPSII£lEEAKlCKIVEAAKKaNEGLGLQNCATHTEIKIiMKN 317 

Query: 293 GDIILIEIGARIGG-DCIENLMK 314 
+ LIE AR G + IEN+ K

Sbjct: 318 KEPGLIESARRFAC3WNMIENIKK 340 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2533

A DNA sequence (GASx561) was identified in S.pyogenes <SEQ ID 7565> which encodes the amino acid 
sequence <SEQ ID 7566>. Analysis of this protein sequence reveals the following: 

Possible site: 55

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2602 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 



Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics.

Example 2534

A DNA sequence (GASx562) was identified in S.pyogenes <SEQ ID 7567> which encodes the amino acid 
sequence <SEQ ID 7568>. Analysis of this protein sequence reveals the following: 

Possible site: 34 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAD06696 GB:AE001539 HISTIDYL-TRNA SYNTHETASE [Helicobacter pylori J99]
Identities = 75/309 (24%), Positives = 129/309 (41%), Gaps = 35/309 (11%)

Query: 11 KHYRRQENQILLGAWGIESAYVDAEIIVATWRGLQRPKGIKVE--FIQLSNKNIFDVLEK 68 
KG R+P Q G ES DAEII L K + +E + ++++ I + + +

Sbjct: 115 KGRYREFTQCDFOFIGSESLVCDAEHQVIIASL-— KALDIiEDFCVSINHRKIIiNGICE 171 

Query: 69 DLSKKLRFEDISIEAILGKYLCNNDIEIIKCLYEKDKINMELLISLISKISNKLVKQEFX 128 
E + I L K N E +K + D ++ L+ ++ N L EF 
Sbjct: 172 YFGIAQVNEVLRlVDKLEKIGIJSIGVEEELKKECDLDSNTIKDLLEMVQIRQIiroLSHAEFF 231

Query: 129 -KVLVLYEYVKNFLP VDCIYFSLS NLY GTGHYSSMNYKIFIR 169 

K+ L +Y +N ++ +Y L NLY G G+Y+ + Y+ + 

Sbjct: 232 EKIAYLKDYNENLKKGIQDIiERLYQLLGDIiQISQNLYKIDFSIARGLGYYTGIVYETTLN 291 

Query: 170 TKSGDIFDIADGGRIDDMVSKFNKVNVLGVCMGIGTTVLSQEI -EYEIEDRIMI 222 

+ + GGR D + F+K N+ GV IG L + , E + 

Sbjct: 292 DMKS-IjGSVCSGGRYDHLTKNFSKENLQGVGASIGIDRLIVALSEMQLIiDERSTQAKVLI 350 

Query: 223 LVEKIDVKIYKNCLELANKLSGYHCSVFEFPYKKIKKFFKHELYSRHHYIIVRLDGSMEY 282

+ Y N L + + SG V+ +KIKK P + + H ++ V G E+ 
Sbjct: 351 ACMHEEYFSyANRLAESLRQSGIFSEVYP-EAQKIKKPFSYANHKGHEPVAV--IGEEEF 407 

Query: 283 RFSSVALKN 291 
+ +++LKN

Sbjct: 408 KSETLSLKN 416 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2535

A DNA sequence (GASx564) was identified in S.pyogenes <SEQ ID 7569> which encodes the amino acid 
sequence <SEQ ID 7570>. Analysis of this protein sequence reveals the following: 

Possible site: 56 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1264 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>



No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2536 

A DNA sequence (GASx576) was identified in S.pyogenes <SEQ ID 7571> which encodes the amino acid 
sequence <SEQ ID 7572>. Analysis of this protein sequence reveals the following: 

Possible site: 60 

>>> Seems to have a cleavable N-term signal seq.

Final Results

bacterial outside --- Certainty=0.3000 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2537

A DNA sequence (GASx577R) was identified in S.pyogenes <SEQ ID 7573> which encodes the amino acid 
sequence <SEQ ID 7574>. Analysis of this protein sequence reveals the following: 

Possible site: 17 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -2.60 Transmembrane 2 - 18 ( 1 - 18)

Final Results 

bacterial membrane --- Certainty=0.2041 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2538 

A DNA sequence (GASx579) was identified in S.pyogenes <SEQ ID 7575> which encodes the amino acid 
sequence <SEQ ID 7576>. Analysis of this protein sequence reveals the following: 

Possible site: 13

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3161 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:CAB12286 GB:Z99106 similar to hypothetical proteins [Bacillus subtilis] 
Identities = 62/140 (44%), Positives = 88/140 (62%), Gaps = 3/140 (2%)

Query: 3 LTNWQEVSLADFGKPrJHHKaYmKRLRITCGRPFPKDGHLDraPRMLEEHG^ 62 

L +++S F KP H+A +N RLKTTGGR+ +++ N + L EHG 1+ 

Sbjct: 6 IKJKLTEDISETyFKOTFRHQALFNDRLKTTGGRYIiTSHNIEJOTUCyLIEHGRE 65 

Query: 63 RHELCHYHLYFBGRGYHHKDRDFKDIjIAC3VM3LRY---VPTSSKSKnraHYSCQTq^ 119 

+HELCHYHL+ EG+GY H+DRDF+ LL QVN R+ + +++K + Y C TCGQ Y 
Sbjct: 66 KHELCHYHLHLEGKGYKHRDRDFRMLLQQVNAPRFCTPLKKKAENKKTYMYICTTCGQQY 125 

Query: 120 QRKRRINLAKYVCCaJCHGKI. 139

+KR +N +Y CX3 C GK+ 
Sbjct: 126 IKKRJiMNPDRYRaSKCaiGKI 145 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics.

Example 2539 

A DNA sequence (GASx587R) was identified in S.pyogenes <SEQ ID 7577> which encodes the amino acid 
sequence <SEQ ID 7578>. Analysis of this protein sequence reveals the following: 

Possible site: 53 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -10.40 Transmembrane 46 - 62 ( 39 - 89)
INTEGRAL Likelihood = -5.36 Transmembrane 65 - 81 ( 63 - 89)

Final Results

bacterial membrane --- Certainty=0.5161 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
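
Where a report contains one or more "INTEGRAL" lines, each gives a likelihood score and two residue ranges. The sketch below collects them into records. It assumes that the first range is the predicted segment and that the parenthesised range is a wider candidate span; that reading of the report layout, like the names `TMSegment` and `parse_integral_lines`, is an assumption made for illustration. The sample lines are those of Example 2539.

```python
import re
from typing import NamedTuple


class TMSegment(NamedTuple):
    likelihood: float
    start: int        # first residue of the reported segment
    end: int          # last residue of the reported segment
    outer_start: int  # parenthesised range (assumed wider span)
    outer_end: int


INTEGRAL_RE = re.compile(
    r"INTEGRAL\s+Likelihood\s*=\s*(-?\d+\.\d+)\s+Transmembrane\s+"
    r"(\d+)\s*-\s*(\d+)\s*\(\s*(\d+)\s*-\s*(\d+)\s*\)"
)


def parse_integral_lines(text):
    """Return one TMSegment per INTEGRAL line found in the report text."""
    return [
        TMSegment(float(score), int(a), int(b), int(c), int(d))
        for score, a, b, c, d in INTEGRAL_RE.findall(text)
    ]


report = """\
INTEGRAL Likelihood = -10.40 Transmembrane 46 - 62 ( 39 - 89)
INTEGRAL Likelihood = -5.36 Transmembrane 65 - 81 ( 63 - 89)
"""

segments = parse_integral_lines(report)
for seg in segments:
    length = seg.end - seg.start + 1
    print(f"{seg.start}-{seg.end} ({length} residues), likelihood {seg.likelihood}")

# The lowest (most negative) score among the segments.
strongest = min(segments, key=lambda seg: seg.likelihood)
```

In the reports reproduced here, examples with more negative likelihood values also show higher membrane certainty, which is why the sketch picks the minimum score as the strongest segment.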

Example 2540 

A DNA sequence (GASx590R) was identified in S.pyogenes <SEQ ID 7579> which encodes the amino acid 
sequence <SEQ ID 7580>. Analysis of this protein sequence reveals the following: 

Possible site: 35 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2036 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2541 

A DNA sequence (GASx592R) was identified in S.pyogenes <SEQ ID 7581> which encodes the amino acid 
sequence <SEQ ID 7582>. Analysis of this protein sequence reveals the following: 

Possible site: 23 

>>> Seems to have a cleavable N-term signal seq.

INTEGRAL Likelihood = -4.62 Transmembrane 25 - 41 ( 24 - 43)

Final Results

bacterial membrane --- Certainty=0.2848 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics.

Example 2542 

A DNA sequence (GASx600) was identified in S.pyogenes <SEQ ID 7583> which encodes the amino acid 
sequence <SEQ ID 7584>. Analysis of this protein sequence reveals the following: 

Possible site: 24 

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -2.18 Transmembrane 3 - 19 ( 2 - 19)

Final Results

bacterial membrane --- Certainty=0.1871 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2543 

A DNA sequence (GASx603R) was identified in S.pyogenes <SEQ ID 7585> which encodes the amino acid 
sequence <SEQ ID 7586>. Analysis of this protein sequence reveals the following:

Possible site: 48 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3027 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:CAA03927 GB:AJ000109 gluthatione peroxidase [Lactococcus lactis]
Identities = 79/133 (59%), Positives = 103/133 (77%)





Query: 1 VVLVVNTATKCGLTPQYQftLQaLYDTYHDKGFEVLDFPCNQFMfQAPGDAEEINHFCSLT 60 

W+WNTA+KCSS TPQ++ L+ LY+TY D+G E+L FPCNQF NQ G+ EIN FC L 
Sbjct: 25 WIVVOTASKCSFTPQFEGLEKLYETYKDQGLEIIfiFPOSrQFANQIWSElSITEINE^^ 84 



Query: 61 YHTTFPRFAKIKVNGKDADPLFTWLKEEKSGPLGKRIEVMFTKFLIDQNGQVIKRYSSKT 120 

Y TF F KIKVEGK+A PL+ +LK+E G Ij I+WNFTKFLID++GQVI+R++ KT 
Sbjct: 85 YGVTFTMFQOKMGKEAHPLYQFLKaCEAKCSaiLSGTIKraFTKFLIDRDGQVIERFAPK^ 144 





Query: 121 DPKLIEEDLKALL 133 

+P+ +EE++K LL 
Sbjct: 145 EPEEMEEEIKKLL 157 



Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2544 

A DNA sequence (GASx605) was identified in S.pyogenes <SEQ ID 7587> which encodes the amino acid 
sequence <SEQ ID 7588>. Analysis of this protein sequence reveals the following: 

Possible site: 26

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3687 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics.

Example 2545

A DNA sequence (GASx608R) was identified in S.pyogenes <SEQ ID 7589> which encodes the amino acid 
sequence <SEQ ID 7590>. Analysis of this protein sequence reveals the following:

Possible site: 17

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1327 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2546 

A DNA sequence (GASx616) was identified in S.pyogenes <SEQ ID 7591> which encodes the amino acid 
sequence <SEQ ID 7592>. Analysis of this protein sequence reveals the following:

Possible site: 21 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2547 

A DNA sequence (GASx617R) was identified in S.pyogenes <SEQ ID 7593> which encodes the amino acid 
sequence <SEQ ID 7594>. Analysis of this protein sequence reveals the following: 

Possible site: 36 



>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.0677 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2548

A DNA sequence (GASx622R) was identified in S.pyogenes <SEQ ID 7595> which encodes the amino acid 
sequence <SEQ ID 7596>. Analysis of this protein sequence reveals the following: 

Possible site: 16 

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -7.32 Transmembrane 4 - 20 ( 1 - 26)

Final Results

bacterial membrane --- Certainty=0.3930 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics.

Example 2549 

A DNA sequence (GASx632) was identified in S.pyogenes <SEQ ID 7597> which encodes the amino acid 
sequence <SEQ ID 7598>. Analysis of this protein sequence reveals the following: 

Possible site: 31 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -3.40 Transmembrane 83 - 99 ( 82 - 102)

INTEGRAL Likelihood = -1.28 Transmembrane 108 - 124 ( 108 - 124)

Final Results

bacterial membrane --- Certainty=0.2359 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2550 

A DNA sequence (GASx638) was identified in S.pyogenes <SEQ ID 7599> which encodes the amino acid 
sequence <SEQ ID 7600>. Analysis of this protein sequence reveals the following: 

Possible site: 25

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -0.64 Transmembrane 12 - 28 ( 12 - 28)

Final Results

bacterial membrane --- Certainty=0.1256 (Affirmative) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2551 

A DNA sequence (GASx652R) was identified in S.pyogenes <SEQ ID 7601> which encodes the amino acid 
sequence <SEQ ID 7602>. Analysis of this protein sequence reveals the following: 

Possible site: 16 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2622 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

5 The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAA74610 GB:Y14232 hypothetical protein [Bacteriophage TP901-1] 
Identities = 225/485 (46%) , Positives = 308/485 (63%) , Gaps = 20/485 (4%) 

Query: 2 RKVAIYSRVSTINQaEEGYSIQGQIEftLTKYCEAMEWKIYKIireSDAGFSGGKLERPAITE 61 
+KVAIY+RVST NC3AEEG+SI QI+ LTKY EAM W++ Y+DAGFSG KLERPA+ 

Sbjct: 3 KKVAIYTRVSTTNQAEEGPSIDEQIDRLTKYAEMGWQVSDTYTDftGFSGa^ 62 

Query: 62 LIEDGKMmCFDTILVYKLDRLSRNWaDTLYLVKDVFTANNIHFVSLKENIDTSSAMGNLF 121 

LI D +N FDT+LVYKLDRLSR+V+DTLYLVKDVFT N I F+SL E+IDTSSAMG+LF 
Sbjct: 63 LINDIENKAFDTVLVYKLDRLSRSVRDTLYLVKDVFTKNKIDFISLNESIDTSSAMGSLF 122 

Query: 122 LTIiSAIAEFEREQIKERMQFGVMIRAKSGKTTAWKrPPYGYRYNKDEierLSVNELEAAN 181 

LT+LSAI EFERE IKERM G + RAKSGKh- W +GY +N+ L + L+A 
Sbjct: 123 LTILSAINEFERENIKEHMTMGKLGRAKSGKSMmKTAFGYYHNRKTGILEIVPLQATI 182 


Query: 182 VRQMFDMIISGCSIMSITNYARDN-FVGN--TWTHVKVKRILENETYKGLVKYREQTFSG 238 

V Q+F +SG S+ + + ++ +G W++ +++ L+N Y G +K+++ F G 
Sbjct: 183 VEQIFTDYLSGISLTKIiRDmffiSGHIGKDIPWSYRTIJlQTLnNFVYaSYIKFKDSLF^ 242 

Query: 239 DHQAIIDEKTYNKftQIALAHRT DTKnJTRPFQGKYMLSHIAKCGYCGAPLKVCTGR 294 

H+ II +TY K Q L R + N RPFQ KYMLS +A+CGYCGAPLK+ G 

Sbjct: 243 MHKPIIPYETYLKVQKELEERQQQTYERNNNPRPFQAKYMLSGMaRCGYCGAPLKIVLGH 302 

Query: 295 AKNDGTRRQTYVCVNKTESLARRSVNNYNNQKICljmSRYEKKHIEKYVIDVLYKLQHDKE 354 
+ DG+R Y C N+ + + YH+ K C++G Y+ ++E VXD L Q + + 

Sbjct: 303 KRKDGSRTMKYHCANRFPR-KTKGITVYNDNKKCDSGTYDLSNLENTVIDNLIGFQENND 361 

Query: 355 YLKKIKKDDN--IIDITPLKKEIEIIDKKI]miNDLYINDLIDLPKLKKDIEELNHLKDD 412 
L KI +N I+D + KK+I IDKKI + +DLY+ND I + +LK + L K 
Sbjct: 362 SLLKIINGNNQPILDTSSFKRQISQIDKKIQKNSDLYIJSIDFITMDELKDRTDSLQAEK-- 419 

Query: 413 YNKAIKLNYLDKKNEDSLGML MDNLDIRKSSYDVQSRIVKQLIDRVEVTMDNID 466 

K +K + K DS + + ++ I + SYD + +IV L+ +V+VT DN+D 

Sbjct: 420 --KLLKAKISENKFNDSTDVFELVKTOLGSIPINEIiSYDNKKKIVNNLVSKVDVTADl!^ 477 


Query: 467 IIFKF 471 

IIFKF 
Sbjct: 478 IIFKF 482 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
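
Each homology hit is summarised by a line of the form "Identities = a/n (p%) , Positives = b/n (q%) , Gaps = c/n (r%)". As an illustration only, the sketch below (Python; the names `FIELD_RE` and `parse_summary` are hypothetical) extracts the counts from such a line and recomputes each ratio next to the percentage printed in the report. It does not assume any particular rounding rule for the reported figure.

```python
import re

# Matches fields such as "Identities = 225/485 (46%)"; extra spaces are tolerated.
FIELD_RE = re.compile(
    r"(Identities|Positives|Gaps)\s*=\s*(\d+)\s*/\s*(\d+)\s*\(\s*(\d+)\s*%\s*\)"
)


def parse_summary(line):
    """Return {field: (count, length, reported_percent, recomputed_percent)}."""
    fields = {}
    for name, count, length, reported in FIELD_RE.findall(line):
        count, length = int(count), int(length)
        fields[name] = (count, length, int(reported), 100.0 * count / length)
    return fields


# Summary line of the hit reported in Example 2551.
summary = "Identities = 225/485 (46%) , Positives = 308/485 (63%) , Gaps = 20/485 (4%)"

for name, (count, length, reported, recomputed) in parse_summary(summary).items():
    print(f"{name}: {count}/{length}, reported {reported}%, recomputed {recomputed:.1f}%")
```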

Example 2552 

A DNA sequence (GASx653R) was identified in S.pyogenes <SEQ ID 7603> which encodes the amino acid 
sequence <SEQ ID 7604>. Analysis of this protein sequence reveals the following: 
Possible site: 48 

>>> Seems to have no N-terminal signal sequence 

INTEGRAL Likelihood = -1.22 Transmembrane 86 - 102 ( 86 - 102) 

Final Results 

bacterial membrane --- Certainty=0.1489 (Affirmative) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAF12707 GB:AF066865 unknown [bacteriophage TPW22] 
Identities = 45/67 (67%) , Positives = 53/67 (78%) , Gaps = 2/67 (2%) 

Query: 57 EKEAWCPKCKSTNVGFMQQGKKTFSVKKAVAGTLLIG--GTVMGFLGEKGKKQWHCNEC 114 

+K A++CPKCKST+V FMQQGKK FSV KAV G +L G GT+ GF G+KGKKQWHCN C 
Sbjct: 138 DKHAIKCPKCKSTDWFMQQGKKGFSVGKAVGGftVLTGGIGTiaGPAGKKjGKRQra 197 

Query: 115 SCIFETK 121 
+FETK 

Sbjct: 198 GRVFETK 204 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2553 

A DNA sequence (GASx655) was identified in S.pyogenes <SEQ ID 7605> which encodes the amino acid 
sequence <SEQ ID 7606>. Analysis of this protein sequence reveals the following: 

Possible site: 50 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.3956 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAB63661 GB:AJ251789 Cro protein [Lactobacillus casei bacteriophage A2] 

Identities = 43/76 (56%) , Positives = 55/76 (71%) 

Query: 26 MTINLKHLKAERIASGMTQCEVAQSMGWRTRTPYAKRENGIVSIGADEIAKITLIFGLPI 85 
MT+NLKRL+AERIA GM Q E+A++M6W TR+ YAKRENGI +1 A EL K+ I G 
Sbjct: 1 miiNLKRLRAERIAKGMNQDEMAKftMGWHTOSSYAKRENGITTISATE^ 60 






Query: 86 EKIAIFFDKDVPVMER 101 

++ +FF +VP ER 
Sbjct: 61 NQLDLFFTNNVPDRER 76 



Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
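
The identity count in an alignment block can be checked directly from the Query and Sbjct rows. The sketch below is illustrative only (Python; the function name `count_identities` is hypothetical). It compares the two rows column by column and counts identical residues and gap columns; it does not compute "Positives", which would require a substitution matrix not reproduced here.

```python
def count_identities(query, subject):
    """Return (identities, gap_columns, aligned_length) for two aligned rows.

    Both rows must have the same length; '-' marks a gap.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    identities = 0
    gap_columns = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            gap_columns += 1      # a gap in either row
        elif q == s:
            identities += 1       # identical residue in both rows
    return identities, gap_columns, len(query)


# Last alignment block of Example 2553 (Query 86-101, Sbjct 61-76).
query_row = "EKIAIFFDKDVPVMER"
sbjct_row = "NQLDLFFTNNVPDRER"

print(count_identities(query_row, sbjct_row))
```

For this block the identical columns are F, F, V, P, E and R, which correspond to the letters shown on the middle line of the alignment.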

Example 2554 

A DNA sequence (GASx656) was identified in S.pyogenes <SEQ ID 7607> which encodes the amino acid 
sequence <SEQ ID 7608>. Analysis of this protein sequence reveals the following: 

Possible site: 34 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4505 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2555 

A DNA sequence (GASx657) was identified in S.pyogenes <SEQ ID 7609> which encodes the amino acid 
sequence <SEQ ID 7610>. Analysis of this protein sequence reveals the following: 

Possible site: 35 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.6593 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2556 

A DNA sequence (GASx658) was identified in S.pyogenes <SEQ ID 7611> which encodes the amino acid 
sequence <SEQ ID 7612>. Analysis of this protein sequence reveals the following: 

Possible site: 32 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.5244 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2557 

A DNA sequence (GASx660) was identified in S.pyogenes <SEQ ID 7613> which encodes the amino acid 
sequence <SEQ ID 7614>. Analysis of this protein sequence reveals the following: 

Possible site: 58 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.1133 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAB99331 GB:D67572 purine NTPase [Methanococcus jannaschii] 

Identities = 71/346 (20%) , Positives = 154/346 (43%) , Gaps = 52/346 (15%) 

Query: 8 MSITINKLEIENVK RIKAVKIEPSATGLTIIGG1INNQGKTSVLDAIAWAL--GGN 60 

MS+ + ++ + N K RIK K G+ I G N GK+S+ +A+ +AL G+ 

Sbjct: 1 MSMILKEIRMNNFKSHVNSRIKFEK- GIVAIIGENGSGKSSIFEAVFFALFGAGS 54 

Query: 61 KYKPSQftMREGSQ- - -VPPTLKITMSNGLIVERKGKNASLKVIDENGQ- - - --EG 107 

+ + +G + V ++ +N 1+ + NG+ K 

Sbjct: 55 NENYDTIITKBKKSVYVEI^FEVNGNNYKIIREYDSGRGGAKLYKNGKPYATTISAVl^ 114 


Query: 108 GQQLL DSFVEELAI NLPKFMDSTPKEKADVLLEIIGVGDQLAELELKEKEIYN 160 

++L + F+ + I + KF+ P EK + + +++G+ D+ + K EI 
Sbjct: 115 VbffllLGVDRimFIiNSIYIKQGEIAKFLSLKPSEKLETVaKLLGI-DEFEKCYQKMGEIVK 173 

Query: 161 QRHAIGVIADQKEKFAKEMTyyPDAPEQriVS-ISEIjIQQHQaiLaKNGE-NAQKR--QNV 216 

+ + E+ E+ y + K+L + +S+L ++++ ++ N + M K+ +++ 

Sbjct: 174 E YEKRLERIEGELNYKENYEKELKNKMSQLEEKNKm.MEIM)KLNKIKKEFEDI 227 

Query: 217 ERIRYDYNQSILEVDRIJlKIiftDAEAICrNKLSEDLKIANTD AMDLHDESTAEIE 270 

E++ ++ L ++ L + + +++LKI D A + + E E 

Sbjct: 228 EKLENEWENKKLLYEKFIimiEERKRALELKNQELKILEYDLimATEftRETI^^ 287 

Query: 271 ANIADIDEVTOKVRANFDKDKAE-EDAKQQREQYNILTNDIESIRQ 315 
+ +DE+ RK+ + + K+ ED + +Q 1+ DIE +++ 
Sbjct: 288 KYKSLVDEI-RKIESRLRELKSHYEDYLKLTKQLEIIKGDIEKLKE 332 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2558 

A DNA sequence (GASx661) was identified in S.pyogenes <SEQ ID 7615> which encodes the amino acid 
sequence <SEQ ID 7616>. Analysis of this protein sequence reveals the following: 

Possible site: 28 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.1559 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 



No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2559 

A DNA sequence (GASx662) was identified in S.pyogenes <SEQ ID 7617> which encodes the amino acid 
sequence <SEQ ID 7618>. Analysis of this protein sequence reveals the following: 

Possible site: 52 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.3292 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2560 

A DNA sequence (GASx663) was identified in S.pyogenes <SEQ ID 7619> which encodes the amino acid 
sequence <SEQ ID 7620>. Analysis of this protein sequence reveals the following: 

Possible site: 15 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4867 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2561 

A DNA sequence (GASx664) was identified in S.pyogenes <SEQ ID 7621> which encodes the amino acid 
sequence <SEQ ID 7622>. Analysis of this protein sequence reveals the following: 

Possible site: 46 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2141 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2562 

A DNA sequence (GASx667) was identified in S.pyogenes <SEQ ID 7623> which encodes the amino acid 
sequence <SEQ ID 7624>. Analysis of this protein sequence reveals the following: 

Possible site: 59 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2614 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 



No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAF80834 GB:AF165214 Orf78 [Pseudomonas phage D3] 
Identities = 68/200 (34%) , Positives = 109/200 (54%) , Gaps = 10/200 (5%) 



Query: 12 GLRFGSLOTINRNimNSKGGNARWNCLCDCGNKTWI-GSKLRSGyTKSCGCaRK^ 70 
GLR G + V ++6+WC CDCGN+ ++ G+ +R+ T SCGC+R + 
Sbjct: 8 GLRVGKWV- -EAFSHCAGKASHWVCRCDCGNRVIMRRGHLMRITOTTTSCGCSRFSH- - - 62 

Query: 71 GYSSTRLYRIWKGMMNRCYNHKNDNYKYYGGKGISICDEWLTFINFRTWSLSNGYKESLT 130 
G + T Y W M++RC N N Y Y G+GI++C+ W+TF NF G + T 
Sbjct: 63 G^ra3TPTYSSWSNMIDRCTOPSNKRYVDYQGRGITVCERWMTFANFLA---DM6ERPDaT 119 

Query: 131 - IDRINPRGNYTPIiNCRWSMKMQQNNKTISlNRYLSYLGQEYTIAEPSEKm^^ 189 
+DRI+ Y NCRW + Q NN N ++ YLG+ T+++++ +L + T+ ++ 
Sbjct: 120 SLDRIDNDAGYFKENCRWATALEQMNNTRRNTFVEYIXSRRQWSQWAGQLGIPECTLRSR 179 

Query: 190 LKLGWSVERIVEEARMKNDR 209 
L GWS+E +++ K R 
Sbjct: 180 LNRGWSIEDAMQKPISKQRR 199 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 



Example 2563 

A DNA sequence (GASx668) was identified in S.pyogenes <SEQ ID 7625> which encodes the amino acid 
sequence <SEQ ID 7626>. Analysis of this protein sequence reveals the following: 

Possible site: 41 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.1476 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAB75598 GB:AJ271879 putative DNA helicase [uncultured eubacterium] 

Identities = 42/168 (25%) , Positives = 75/168 (44%) , Gaps = 7/168 (4%) 

Query: 374 lAGPSKAGKSFALIELSIAIMGQKIOTiG-WQCEQGKVLYVNLEIJDRPSALHRFKDVYDflM 432 

+. P AGKS ++L+ +A G LG + G V+Y+ E D P+A+H A 
Sbjct: 35 LVSPGGAGKSMIALQIAAQIAGGPDLLGVGELPTGPVIYLPAE-DPPTAIHHRIiHALGAH 93 

Query: 433 GLPPMIVANIDIWin^GKOTPMDKIiaPKliIRRSLKKI!reC3A---VIIDP 489 

A D ++ + + +LK+ + +I+D + + +EN++ 

Sbjct: 94 LSAEERQftVaOGLLIQPLIGSLPNIMASIWPEALKRAaEGRRmil^TLRRFHIEEENaS 153 


Query: 490 DQMAHFTNQFDKVATELGCSVIYCHHHSKGS--QGGKKSMDRASGSGV 535 

MA + + +A + GCS+++ HH SKG+ G + GS V 

Sbjct: 154 GPMAQVIGRMEAIAADTGCSIVFLHHASKGATMMGAGDQQQASRGSSV 201 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2564 

A DNA sequence (GASx669) was identified in S.pyogenes <SEQ ID 7627> which encodes the amino acid 
sequence <SEQ ID 7628>. Analysis of this protein sequence reveals the following: 

Possible site: 56 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2555 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2565 

A DNA sequence (GASx670) was identified in S.pyogenes <SEQ ID 7629> which encodes the amino acid 
sequence <SEQ ID 7630>. Analysis of this protein sequence reveals the following: 

Possible site: 54 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.2921 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAP74082 GB:AF212845 ORF129 [Lactococcus lactis bacteriophage ul36] 

Identities = 36/108 (33%), Positives = 63/108 (58%), Gaps = 1/108 (0%) 

Query: 8 lEFFLPMDKIPTTTHQQKKVTVINGKPHFYEPESIiKNaRDKFTSLLAQHVPPSKLDGPIR 67 

++F +DK+PTT QQK + + GK FY+ KN K + + + + P++ 

Sbjct: 1 MKFEFELDroWTT-QQQKGIKKVKGKLQFYDimGTKNYSLKAQMCNKPKECFEKl^ 59 

Query: 68 LTVKWLFPKIKGSTNGQYKTTKPDTDNLQKLLKDCMTELGFWNDDAQV 115 

L+V + + + Q+KT++PD DNL K L+D MT+L +++DD+Q+ 

Sbjct: 60 LSVTFFYAIKQKKRVmQWKTSRPDLDNLMKNLQDYMTKLRYYSDDSQI 107 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2566 

A DNA sequence (GASx671) was identified in S.pyogenes <SEQ ID 7631> which encodes the amino acid 
sequence <SEQ ID 7632>. Analysis of this protein sequence reveals the following: 

Possible site: 33 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4294 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2567 

A DNA sequence (GASx672R) was identified in S.pyogenes <SEQ ID 7633> which encodes the amino acid 
sequence <SEQ ID 7634>. Analysis of this protein sequence reveals the following: 

Possible site: 15 

>>> Seems to have a cleavable N-term signal seq. 
INTEGRAL Likelihood = -6.37 Transmembrane 106 - 122 ( 104 - 125) 

Final Results 

bacterial membrane --- Certainty=0.3548 (Affirmative) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
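
The "INTEGRAL" lines report a likelihood score and two residue ranges for each predicted transmembrane segment. The sketch below (Python) shows one way such a line could be read into a record. The names `INTEGRAL_RE` and `parse_integral` are hypothetical, and the labels "core" (first range) and "extended" (range in parentheses) are assumptions made for this illustration; they are not terms used in this specification.

```python
import re

# Matches e.g. "INTEGRAL Likelihood = -6.37 Transmembrane 106 - 122 ( 104 - 125)".
INTEGRAL_RE = re.compile(
    r"INTEGRAL\s+Likelihood\s*=\s*(-?\d+\.\d+)\s+Transmembrane\s+"
    r"(\d+)\s*-\s*(\d+)\s*\(\s*(\d+)\s*-\s*(\d+)\s*\)"
)


def parse_integral(line):
    """Return a dict describing one INTEGRAL line, or None if it does not match."""
    match = INTEGRAL_RE.search(line)
    if match is None:
        return None
    core = (int(match.group(2)), int(match.group(3)))      # assumed label
    extended = (int(match.group(4)), int(match.group(5)))  # assumed label
    return {
        "likelihood": float(match.group(1)),
        "core": core,
        "extended": extended,
        "core_length": core[1] - core[0] + 1,  # residues, inclusive
    }


# INTEGRAL line of Example 2567.
line_2567 = "INTEGRAL Likelihood = -6.37 Transmembrane 106 - 122 ( 104 - 125)"
print(parse_integral(line_2567))
```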

Example 2568 

A DNA sequence (GASx673) was identified in S.pyogenes <SEQ ID 7635> which encodes the amino acid 
sequence <SEQ ID 7636>. Analysis of this protein sequence reveals the following: 

Possible site: 56 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4781 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAB18697 GB:U38906 ORF22 [Bacteriophage r1t] 
Identities = 78/207 (37%) , Positives = 123/207 (58%) , Gaps = 2/207 (0%) 

Query: 28 EIHRILGIDEVYKAPKRLTDILFDKDSREDIFRQFLKYETDVSYDWFMQYFEEEQADRKN 87 

+ + +L +DE R+ +++FDK RE+ + + L D+ D+F YF A 

Sbjct: 7 QFyDM]am)EHM!IFTNRIQEriVFDKKGREEFYSKIIjNIHHDMGVDFPRDYFMftHSAVSA- 65 

Query: 88 KKQDFTPKSVSTLLSKlISGNQYYEVa.-VGTGGILIQAWQEQRLNDSPFTYRPSKYWYHV 146 

K Q +TP + L + ++ G+ ++ GTG ++IQ WQ+ R+N F Y PS YWY 

Sbjct: 66 KGQHYTPDELGKLTALLVGGSGGADLTGAGTGTLIIQKWQDDRMNTDFFNYLPSNYWYQA 125 

Query: 147 EELSDKAVPFLLEMSIRGINGVVVHGDSLTRQVKNIYFLQNTKDDMLSFSDINVMPRTQ 206 
ELSD+A+ FL+ +IRG+NGW+HGD+L VK +YF+QH+ ++ + FS+INV+P ++ 
Sbjct: 126 LELSDEAISFLIHAFAIRGRINGVVIHGDALEMAVKQVYFIQNSANNPIGFSEINVIPHSK 185 

Query: 207 DIEREFNVKEWIGDGIEHIENPLIEWI 233 

D + EW IEHIE+ +WI 

Sbjct: 186 DAMEFLGIHEWTEQAIEHIESKFPDWI 212 


Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2569 

A DNA sequence (GASx674) was identified in S.pyogenes <SEQ ID 7637> which encodes the amino acid 
sequence <SEQ ID 7638>. Analysis of this protein sequence reveals the following: 

Possible site: 51 

>>> Seems to have no N-terminal signal sequence 

INTEGRAL Likelihood = -0.00 Transmembrane 122 - 138 ( 122 - 138) 

Final Results 

bacterial membrane --- Certainty=0.1001 (Affirmative) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAP63071 GB:AF158600 gp137 [Streptococcus thermophilus bacteriophage Sfi11] 

Identities = 66/135 (48%) , Positives = 89/135 (65%) , Gaps = 2/135 (1%) 

Query: 5 PEIDIQKTKSNAKRKLREYPRWRRIANDVDTQKVTATYSFEPRQSHGVPSKPVERLAIiNR 64 
PEID + T KRKLREYPRWR lA+D QK+T ++F PR G +KPVE +A+ R 
Sbjct: 4 PEIDEKATLKRCKRKLREYPRWREIAHDSaEQKITQEFTFMPRG--GGVNKPVENIAVRR 61 

Query: 65 VSAEQELDAIEQAVSMILEPERRRILYDKYLAPYKKADKVIYTELCMSESFYYDTLDIAL 124 

V A EL+AIEQAV+ + P+ RRIL +KYLA K+I + + + + L+++ 
Sbjct: 62 VDA1jNE1.EAIEQAVNGLYRPDYRRILIEKYLiAYPPKPNWQIAQSIGFERTAFQELIjNNSI 121 

Query: 125 LAFAELYREGVLLVE 139 

LAFAELYR+G L+VE 
Sbjct: 122 LAFAELYRDGRLIVE 136 






Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2570 

A DNA sequence (GASx675) was identified in S.pyogenes <SEQ ID 7639> which encodes the amino acid 
sequence <SEQ ID 7640>. Analysis of this protein sequence reveals the following: 

Possible site: 41 

>>> Seems to have no N-terminal signal sequence 



Final Results 

bacterial cytoplasm --- Certainty=0.1865 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2571 

A DNA sequence (GASx676) was identified in S.pyogenes <SEQ ID 7641> which encodes the amino acid 
sequence <SEQ ID 7642>. Analysis of this protein sequence reveals the following: 

Possible site: 31 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4870 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:BAB07254 GB:AP001519 unknown [Bacillus halodurans] 
Identities = 194/451 (43%) , Positives = 262/451 (58%) , Gaps = 69/451 (15%) 

Query: 1 MEFVDKKLSEITPYKNNPRNNDEAVGPVAE SIKEFGFKVPIW-DKNGEIVNGHTR 55 

+ V+KK+ ++ P + NPR + + P E SI+EFG PIV ++ G +V GH R 
Sbjct: 3 IRIVNKKIDDLVPAETOPRLDLQPGDPEYEKLKRSIEEFGLVEPIVFNERTGRWGGHQR 62 

Query: 56 YKAAQKLGI^WPVIVADDLSEEQIKAFRLADljnCV-GEIAVWDIiDLUSIEEI^ 114 
K ++LG E VPV V D L + KA +A NK+ G+ + L L EEL+ L +D++ 
Sbjct: 63 LKILRELGWEEVFVSVVD-LDDHHEKAIJWAIiNKIEGDWDNFKLKELLEEiiDSGL-IDVT 120 

Query: 115 AFGFDVLDNLDDL IEDEKDL--DDF TGTVPDEPKSKLGDIYQLGSHKLMCG 163 

GFD + ++DL +EDE ++ DDF +EP +K GD++ LG H L+ G 

Sbjct: 121 L.TGFDE-EEIEDLMTQFFVEDENEIKEDDFDPDEVAEEIEEPITKPGDLWHLGRHFLLVG 179 

Query: 164 DSTNGADVKKLMNGELADLLLTDPPYNVAYEGKTKDSLTIKNDSMDNDSFRQPLVNA^ 223 

DST DVK+LM E AD++ TDPP'XNV YEG T + IKND+M++ F QFL , +AF + 
Sbjct: 180 DSTKIEDVKRIJMGNEKADMIFTDPPYNVDYEGAT--GMKIKNDN^IEDSEPyQFLFnAFVA 237 

Query: 224 ANEVMKPGAVFYIWHADSEGYNFRGACFDIGWTVRQCLIWNKNSMVLGRQDYHWKHEPCL 283 

+V K G Y+ HADSEG FR A D G+ ++QCLIW KNS+VLGRQDYHW+HEP L 
Sbjct: 238 MYQVTKEGGPIYVCHADSEGIiTFRKAFQDSGFLLKQCLIWVKNSLVLGRQDYHWRHEPIL 297 



Query: 284 YGWKDGAGHLWASDRKQTSVID 305 

YGWK GA H W RKQ++VI+ 
Sbjct: 298 YGWKI>GAAHKWYGGRKQSWIEDPVDIiAITPKVDHVLLTF]mGISSTVVKVPSYKIIHDG 357 

Query: 306 YEKPQRNGVHPTMKPVGLFDYQIia®JTKGSDIVLDLFGGSGTTI.IACESNG 356 

E+P+RN HPTMKP+ L I+N++K + VLD FGGSG+TLIACE G 
Sbjct: 358 SDEG^r^TWRIERPKRNaDHPTMKPIALCaRAIQNSSKPGERVLDPFGGSGSTLIACBQTC 417 


Query: 357 RHARLMEYDPKYVDVIIKRWEELTGESVIQL 387 

R +MEYDP Y +VII+RWEE TG++ ++L 
Sbjct: 418 RICHMMEYDPVyAEVIIRRWEEWTGQNAVKL 448 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2572 

A DNA sequence (GASx677) was identified in S.pyogenes <SEQ ID 7643> which encodes the amino acid 
sequence <SEQ ID 7644>. Analysis of this protein sequence reveals the following: 

Possible site: 54 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.4744 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2573 

A DNA sequence (GASx678) was identified in S.pyogenes <SEQ ID 7645> which encodes the amino acid 
sequence <SEQ ID 7646>. Analysis of this protein sequence reveals the following: 

Possible site: 31 

>>> Seems to have no N-terminal signal sequence 

INTEGRAL Likelihood = -0.27 Transmembrane 90 - 106 ( 90 - 106) 

Final Results 

bacterial membrane --- Certainty=0.1107 (Affirmative) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2574 

A DNA sequence (GASx679) was identified in S.pyogenes <SEQ ID 7647> which encodes the amino acid 
sequence <SEQ ID 7648>. Analysis of this protein sequence reveals the following: 

Possible site: 19 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.3408 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAA66734 GB:X98106 minor capsid protein [Bacteriophage phig1e] 
Identities = 213/494 (43%) , Positives = 323/494 (65%) , Gaps = 19/494 (3%) 



Query: 1 MGVIQKIKNLVTRSKyVM-TTQSLTNITDHPKIAISKLEYDRITTNLKYYKSDWDSVLYL 59 
MG+IQ+IK+L + T SL+ ITD P+I+I RY RI T+L YY + Y 
Sbjct: 1 MGLIQRIKDLFWKGftAATGVTGSLSKITDDPRISIDPDEYVRIQTDLDYYSDKLQYIHYQ 60 

Query: 60 OTDGETKKRDRraLPIARTAAKKIASLVENEQAEIKV-DDI»aNEFISETLKNDREK^^ 118 
+DG KKR N + +A+TAA++IAS+VFNE+AEI V D++ A++F+++ L+++ F F 
Sbjct: 61 ASDGIKKKRLKm:iNMAKTAARRIASWFNEKAEIHVKDNNEADKFLM)VLEDNDFK^ 120 

Query: 119 ERYLESCLALGGIiAMRPYVDGDKVRVAFVQAPVFLPLQSNTQDVSSAAWIKSVKTINGK 178 
E LE +ALGG AMRPY+DG+ +++A+V+A F PLQSNT D+S AA+ ++ +T + + 
Sbjct: 121 EElftLEKGVaLGGFMIRPYIDCaJHIKIAWVEJiDQFYPI^JSiraroiSEftAIASRTQRTESN^ 180 

Query: 179 EVYYTLIEFHEWQSSDDYVISNELYRSDDKAKVGSRVPLS - -EVYKDLKDEAKOT0VTRP 236 
YYTL+EFH+WQ + Y I+NELY+SD VG++VPLS VYK+L + ++ + RP 
Sbjct: 181 TKYYTLLEFHQWQnNGSYQITNELYKSDSPDIVGNQVPLSTLPVYKELAPQVTISGLQRP 240 

Query: 237 IFTYLKTPG^™KDINSPLGLSIFDNAKTTIDFI1SITTYDEFMWEVKMGQRRVAVPESLTA 296 
+F Y KTPG UN +1 SPLGL + DNftK +D IN T+D+F+WE+++GQ+ +AV + 
Sbjct: 241 LFAYFKTPGftNNINIESPLGLGVVDNaKHVLDDINDTHDQFIWEIRLGQKHIAVQPGMLR 300 

Query: 297 LTVRTADGDWPRPRFESDQNVYIRMGGRDLDSSAIQDLTTPIRADDYIKAINEGLSLFE 356 
D +P F+++QNVY+ + D + ++D+TTPIR Y AI+ + FE 
Sbjct: 301 F DDEHKPTFDTEQNVYVGVLSDDNNGLGVKDMTTPIRTVQYKDAIDHFIKEFE 353 

Query: 357 MQIGVSAGLFSFDGKSMKTATEIVSENSDTYQMRNSIVTLVEQSLKELVISIFEIAKAYD 416 
+QIG+S G FS+ +KTATE+VS NS TYQ R+S +T+VE+++ EL SIFE+A A 
Sbjct: 354 VQIGLSTGTFSYSNDGVKTATEWSNNSMTYQTRSSYLTMVEKAIDELCQSIFELANAGA 413 

Query: 417 LYQSEVP- -SMDNISISL DDGVFTDRDAEIJDYWIKVVNRGFGTREMAIQKVLNV 468 
L+ P ++D+ S L DDGVF ++D +L+ KV+ G +++ +Q+ + 
Sbjct: 414 LFDroKPLFTLDSASQPLDIECHFDDGVFVNKDKQLEEDAKVLAIGaLSRQTFLQRNYGM 473 

Query: 469 TEEKAQEIAAEINT 482 
T+E+A E A+I + 
Sbjct: 474 TDEQAAEELAKIQS 487 





Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2575 

A DNA sequence (GASx680) was identified in S.pyogenes <SEQ ID 7649> which encodes the amino acid 
sequence <SEQ ID 7650>. Analysis of this protein sequence reveals the following: 

Possible site: 48 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm --- Certainty=0.1840 (Affirmative) < succ> 

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ> 

bacterial outside --- Certainty=0.0000 (Not Clear) < succ> 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAB53790 GB:AJ242593 gp4 [Bacteriophage A118] 

Identities = 114/385 (29%) , Positives = 187/385 (47%) , Gaps = 23/385 (5%) 

Query: 8 IM3EQLLLEASQLSDMYHQLTrj3LFDQVIERIKARGSASLftDNmiWQANKLHDVGLLNA 67 
L QL B + D+Y L +LF ++ R+K + + S AUN WQ KL+ V L+ 
Sbjct: 3 LTPRQLDLFVQPIVDVYTGLENELFTLIVRRLKTKKNIS-MJimiAWQIEKm^ 61 

Query: 68 DNIKLIAKYSGIAEAQLRYIIKNEGFKIYKNTSEQLEEALGRESGV NSTIQDD 120 

1+ I+K SG++ +L ++K+ G+ K + E+G TI D 

Sbjct: 62 QMIERISKASGVSAKKLFSVVKDAGYSDLKjQtVnNYFSKIA--EAGAVLPL^^ 119 


Query: 121 LSNYARQAIDDVHNLTNTTLPPSVIGRYQGIIQiavaGVVTGLKTPDQAINQ^ 180 

+ + + + N T+ Y II + V+ GLKT QA+ +TV K+ + 

Sbjct: 120 VMRSYFKLAESiraRINQTMLSQRRQIYSDIIHETrQSVLAGLKTHRQftLRETTO 179 

Query: 181 GFYGFTDKRGRKWRADSYARWINTTTVTOVFNEAKEAPAREFGIDTFYYSKKATAREMCA 240 

G DKA ++W ++Y RTV TT V+N ++ E+G+D S+ AR C+ 

Sbjct: 180 GVPALVDKANKRWTPEAYVRTVTRTTVNSVYMSVEDERMNEYGVDLVRISQHVGARPTCS 239 

Query: 241 PLQHQIV---TTGEAREEGGIKIIALSD YGHGEPDGCLGINCKHTKTPFWGVNSK 293 

25 +Q +++ + E R + G K +++ YG+G DG G NC+H + F+ 6+N 

Sbjct: 240 IVQGKVIci:iLSVEETRSKYGIJKYMSIYSPELRYGYG--DGIFGCNCRHHRFAFIEGINIA 297 

Query: 294 PELPEHLKNITPAQAKANANAQAKQRAIERSIRKSKELLHVAKQLGDKELIRQYQSDVRS 353 
P+ E I+K +QR +ER IR +K L A++LGD+ +++ + VR+ 

30 Sbjct: 298 PDESE- - -LIDEEENKRVYALSQQQRIJ1ERDIRAAKRKLSAAEELGDELRVKKAKQAVRT 354 

Query: 354 KQDALNYLINNNAFLHRNQAREKRY 378 

KQ Ii + + li R +REK Y 
Sbjct: 355 RQSKLRAFVKTHN-LTRQYSREKVY 378 

35 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
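
By way of illustration only, the summary figures printed above each alignment (for instance, Identities = 114/385 (29%)) can be read and re-derived programmatically. The Python sketch below is not part of the original analysis: the function names `parse_summary` and `recomputed_percent` are hypothetical, and the regular expression merely assumes the "name = count/length (percent%)" layout seen in this listing. Comparing the recomputed value with the printed percentage may help flag figures that were damaged in transcription.

```python
import re

# Hypothetical pattern for one field of a BLAST-style summary line.
# Extra spaces are tolerated because scanned listings often contain them.
SUMMARY_RE = re.compile(
    r"(Identities|Positives|Gaps)\s*=\s*(\d+)\s*/\s*(\d+)\s*\(\s*(\d+)\s*%\s*\)"
)


def parse_summary(line):
    """Return {field: (count, aligned_length, reported_percent)}."""
    return {
        name.lower(): (int(count), int(length), int(percent))
        for name, count, length, percent in SUMMARY_RE.findall(line)
    }


def recomputed_percent(count, length):
    """Percentage as a float, for comparison with the printed integer."""
    return 100.0 * count / length


line = "Identities = 114/385 (29%) , Positives = 187/385 (47%) , Gaps = 23/385 (5%)"
stats = parse_summary(line)
for name, (count, length, reported) in stats.items():
    # Print the reported and recomputed values side by side.
    print(name, reported, round(recomputed_percent(count, length), 2))
```

The sketch reports both numbers rather than deciding which is correct, since the rounding convention of the original software is not stated in the text.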

Example 2576 

A DNA sequence (GASx681) was identified in S.pyogenes <SEQ ID 7651> which encodes the amino acid
sequence <SEQ ID 7652>. Analysis of this protein sequence reveals the following:

Possible site: 31

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.2756 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2577

A DNA sequence (GASx682) was identified in S.pyogenes <SEQ ID 7653> which encodes the amino acid 
sequence <SEQ ID 7654>: 

TLDNQSVIKAIGDTVDYIKKNYKRKWGK 

Analysis of this protein sequence reveals the following: 

Possible site: 25

>>> Seems to have a cleavable N-term signal seq.

Final Results

bacterial outside Certainty=0.3000 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
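
The "Final Results" listings in these Examples all follow one pattern: a compartment name, a certainty value and a verdict in parentheses. As a purely illustrative aid (not part of the original analysis), the following Python sketch reads such a listing and returns the compartment with the highest certainty. The names `parse_localization` and `best_localization` are hypothetical, and the pattern assumes only the line layout visible in this document, including the stray spaces that appear inside some numbers.

```python
import re

# Hypothetical pattern for one localization line. Spaces inside the number
# (e.g. "0 . 5288") are accepted because they occur in scanned listings.
LINE_RE = re.compile(
    r"bacterial\s+(\w+)\W*Certainty\s*=\s*([\d.\s]+?)\s*\(([^)]*)\)"
)


def parse_localization(text):
    """Return a list of (compartment, certainty, verdict) tuples."""
    results = []
    for label, raw_value, verdict in LINE_RE.findall(text):
        value = float(raw_value.replace(" ", ""))
        results.append((label, value, verdict.strip()))
    return results


def best_localization(text):
    """Return the entry with the highest certainty (first one on ties)."""
    return max(parse_localization(text), key=lambda entry: entry[1])


listing = """
bacterial outside Certainty=0.3000 (Affirmative) < succ>
bacterial membrane Certainty=0 . 0000 (Not Clear) < succ>
bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>
"""
print(best_localization(listing))
```

The example listing reuses the values reported for Example 2577; one line is deliberately written with the extra spaces to show that the pattern tolerates them.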

Example 2578 

A DNA sequence (GASx683) was identified in S.pyogenes <SEQ ID 7655> which encodes the amino acid
sequence <SEQ ID 7656>. Analysis of this protein sequence reveals the following: 

Possible site: 60

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.5288 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2579 

A DNA sequence (GASx685) was identified in S.pyogenes <SEQ ID 7657> which encodes the amino acid 
sequence <SEQ ID 7658>: 

GaTEVGaNRVVSGVYGEVLGVQIVRSRKCPKBTA™VRKiGaI«IMLKRNT^WETDRDITKAINQIV;^^ 
K 

Analysis of this protein sequence reveals the following: 

Possible site: 18 

>>> Seems to have no N-terminal signal sequence 

Final Results 

bacterial cytoplasm Certainty=0.1750 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:CAA59185 GB:X84706 major head protein [Bacteriophage HI]
Identities = 138/270 (51%), Positives = 186/270 (68%), Gaps = 6/270 (2%)

Query: 1
M+ T +A +++PEVLA ++ E+ Kft.+RFAPIjA+vDTTL+GQPG TL P + YIGD
Sbjct: 1 MSKQKTTLADLVNPEOTATIVSYEUIKMiRFAPtAQTOTTLQGQPGNTLKPPDPB^ 60

Query: 60 AEDVAEGEAIPMTQLGFKKTTMTIKKAGKGVEITDEAILSGYGDPVGQAAKQIVEAIDHK 119
A DVAEG I + ++G ++TIKKA KG EITDEA LSGYGDP+G++ KQ+ ++ +K
Sbjct: 61 AADVJffiGGEISLDKIGTTTKSVTIKKAAKGTEITDEAALSGYGDPIGESNKQLGLSLftNK 120

Query: 120 VnADVIJ^ALSKSTQTVEa.TATVDGVSKaLDIBm)EDDAEWIV^^ 179
VD D+L A ++QTV A VDGV ALDIENDED V+++NP DA+ +R UA +
Sbjct: 121 VDDDLLSAaKTTSQWSTKaNVDGVeaALDIEmjEIJAQaYVLIVNPKnaAKIR^^ 180

Query: 180 LGATEVGANRWSGVYGEVLGVQIVRSRKCPKGTAYMVR KQALRIMLKRNTMVETD 235
+G +EVGRN +++G Y +VLG QIVRS+K +G+A M + AL+++LKR VETO
Sbjct: 181 IG-SEVGANALINGTYADVLQAQIWSKKIAEGSRLMFKIVSNSPALKLVLKRGVQVETD 239

Query: 236 RDITKAINQIVANKHYGVYLYKAEKAVKIT 265
RDI I A++HY YLY K V IT
Sbjct: 240 RDIVTKTTVITADEHYAAYLYDLTKWNIT 269

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2580 

A DNA sequence (GASx686) was identified in S.pyogenes <SEQ ID 7659> which encodes the amino acid 
sequence <SEQ ID 7660>. Analysis of this protein sequence reveals the following: 

Possible site: 35

>>> Seems to have an uncleavable N-term signal seq

Final Results

bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2581 

A DNA sequence (GASx687) was identified in S.pyogenes <SEQ ID 7661> which encodes the amino acid
sequence <SEQ ID 7662>. Analysis of this protein sequence reveals the following:

Possible site: 54

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.2942 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 



Example 2582

A DNA sequence (GASx688) was identified in S.pyogenes <SEQ ID 7663> which encodes the amino acid 
sequence <SEQ ID 7664>. Analysis of this protein sequence reveals the following: 

Possible site: 21

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.2844 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAC00538 GB:L02496 unknown protein [Bacteriophage LL-H]
Identities = 35/86 (40%), Positives = 48/86 (55%), Gaps = 6/86 (6%)

Query: 24 KliININNQVmSmPYVPYRDGKLRGSSRANSVGVTWSGPHARiiQFYGGA'mCYKSFKFK^ 83
+L + NQ+ M YVP R G LR S N G+ ++ +ARAQFYG + +
Sbjct: 20 RLQVI^QMHQD^ffiQYVPKRftGFIJRSQSFTODTGIHYTAKyM^ftQFYGFV NGHRVRN 75

Query: 84 YTTPGTGKRWDKRALANATIVKDWEK 109
y+TPGTG+RWD + A A DW+K
Sbjct: 76 YSTPGTGRRWDLK--AKAVYKftDWQK 99

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2583 

A DNA sequence (GASx689) was identified in S.pyogenes <SEQ ID 7665> which encodes the amino acid 
sequence <SEQ ID 7666>. Analysis of this protein sequence reveals the following:

Possible site: 45

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.2892 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:CAA66741 GB:X98106 minor capsid protein [Bacteriophage phig1e]
Identities = 36/109 (33%), Positives = 64/109 (58%), Gaps = 2/109 (1%)

Query: 17 DLGIKPRLDYLTRQEDIAIYPMPGGKmffiYMDGTREISLPFEIAIKTKNQEIAS'IWlWT 76
+L +K L YLT + L++YP+PG +V +E G ++ + +E+ ++TKNQ+ A+T +W
Sbjct: 16 NLPMKCTLGYLTAftDSLSLYPLPGSRVLDEDYAGNQQWQMNYWGmTKNQQQAOTTLV^ 75

Query: 77 INSALSNPDL-KLPSIiNHSYTFISLDVE-KPPIiNDLSDQGFYIYVLDIT 123
++ AL L S N S+ F SL + +P +++ QG+ Y L +
Sbjct: 76 VSQALDVLTADDLVSSNQSFEFESLTINGQPSISEQDTQGYSTYQLSFS 124

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2584 

A DNA sequence (GASx690) was identified in S.pyogenes <SEQ ID 7667> which encodes the amino acid
sequence <SEQ ID 7668>. Analysis of this protein sequence reveals the following:

Possible site: 18

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.1626 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:CAB53798 GB:AJ242593 major tail shaft protein [Bacteriophage A118]
Identities = 54/133 (40%), Positives = 77/133 (57%), Gaps = 9/133 (6%)

Query: 1 MRQKNALRGHFIAPYVKGEEKTEVTKEKLLELARWIKDISDDTDEKTEDEAYYDGDGTEE 60
MR KNA + +A V G + + + L++WI ++SDD + TE++ YDQDG E+
Sbjct: 1 MRIKNARTKYSVAElVAiGAGEPDWKR LSKWITNVSDDGSDNTEEQGDYDGDGNEK 55

Query: 61 TTWGVKGAYTFEGTYDPEDKAQAHIASLKYKLGDERKVVfflLIVSAIXSKTQlAniiGVAT^ 120
T V+G AYTFEGT+D ED+AQ I + K++R+ I D+T+G ATV+E
Sbjct: 56 TWLGYSEAYTFEGTHDREDEAQNLIVA-KRRTPENRSIMFKIEIPDTETA-IGKATVSE 113

Query: 121 I--IAGSQAAARF 131
I AG G A F
Sbjct: 114 IKGSAGGGDATEF 126

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2585

A DNA sequence (GASx691) was identified in S.pyogenes <SEQ ID 7669> which encodes the amino acid
sequence <SEQ ID 7670>. Analysis of this protein sequence reveals the following:

Possible site: 17

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.3521 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2586

A DNA sequence (GASx692) was identified in S.pyogenes <SEQ ID 7671> which encodes the amino acid 
sequence <SEQ ID 7672>. Analysis of this protein sequence reveals the following: 

Possible site: 61

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.3438 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database:

>GP:CAB53801 GB:AJ242593 gp15 [Bacteriophage A118]
Identities = 67/191 (35%), Positives = 110/191 (57%), Gaps = 17/191 (8%)

Query: 11 FEFRGEIYPIDLSFNKVLDVFDVIDDDFLNEAEKCFLCLDILLDRTDLPFTYAVD 65
+E+ G+ Y +DL+F+ VL V D+ +D+ L++ + L +D+L D+P+ + +
Sbjct: 12 YEYEGKEYKLDLAFDNVLRVIDLTEDNSLSDVFRANLAIDVLF-ADDMPWPRSNEEDEYA 70

Query: 66 LWVyiKIOTIDAERPEKPQIiDIKGNPMPVVKEKEDNKKVI---DLSIiDaEFIY 115
+ + I TO+I E + DI GN MP D+ + I L+ DA++IY
Sbjct: 71 NIEEKSLVLIDIFTNYIVKENDDGLLYDIDGNKMPSATNMNDDAEEIASYSLTQDADYIY 130

Query: 116 ASFRQAYQINLLKEQNRLSWIEFKALLNALPDDTVMQRIIAIRQWE-DDGEGSKKYRDNM 174
ASF Q Y I+IiL + ++ W +F+ALL +L DDT ++ II IRQ E G+G++K R+ +
Sbjct: 131 ASFLQDYNIDLLDSRGKMHWYKFRALLESLRDDTTIKTIIGIRQAELPSGKBTEKEKISIEL 190

Query: 175 RKLKAKYSLDE 185
KLK +Y L +
Sbjct: 191 IKLKNRYKLKD 201

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2587

A DNA sequence (GASx694) was identified in S.pyogenes <SEQ ID 7673> which encodes the amino acid 
sequence <SEQ ID 7674>. Analysis of this protein sequence reveals the following: 

Possible site: 29

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.4143 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAG18639 GB:AY007505 unknown [Streptococcus mitis]
Identities = 48/157 (30%), Positives = 85/157 (53%), Gaps = 10/157 (6%)

Query: 86 DLELSWEPDYIYKATHITPFSIKEVLRNFGRLKINFLIHPIKYLKTGKQEVPLVNG-GTL 144
+LE S+ P+ ++ A H S K + +LKI + P +Y KT E NG GT+
Sbjct: 81 ELEFSYHPESVPYA-HPLTASYKPFG^raAWQLKIK]amQPFRYQKTVNPES--raGPGTI 137

Query: 145 QNPGlWQAKPILKIKGTGNGILTINDFETGIiENVQSELVIDMERHLVYKDVLSAWDNIVR 204
NPG + ++PI++++G G+ +TI ET NV+++ ID + +++ +A +
Sbjct: 138 NNPGTIYSEPIIEVQGDGDVSITIGR-ETHfYLNVKTKATIDCRQG--RQNIYNATGAVQN 194

Query: 205 TERHRMPLFDV--GQNKISWTGS-FTITAVPHWGVKV 238
T R R F++ G++ I++TG+ + PNW K+
Sbjct: 195 TIJlKRGGFFEIPTGRSGITFTGNVIiRLIIRPNWRYKI 231

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2588 

A DNA sequence (GASx695R) was identified in S.pyogenes <SEQ ID 7675> which encodes the amino acid
sequence <SEQ ID 7676>. Analysis of this protein sequence reveals the following:

Possible site: 15

>>> Seems to have no N-terminal signal sequence
INTEGRAL Likelihood = -2.60 Transmembrane 15 - 31 ( 15 - 31)

Final Results

bacterial membrane Certainty=0.2041 (Affirmative) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2589 

A DNA sequence (GASx697) was identified in S.pyogenes <SEQ ID 7677> which encodes the amino acid 
sequence <SEQ ID 7678>. Analysis of this protein sequence reveals the following:

Possible site: 22

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.3348 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAA86895 GB:U28144 hyaluronidase [Streptococcus pyogenes]
Identities = 326/337 (96%), Positives = 329/337 (96%)

Query: 1 MSEINIPLRVQFKRMKRAEWARSDVILLESEIGFETDTGFARAGDGHNRFSDLGYISPLDY 60
MSENIPLRVQFKRMKAAEWARSDVILLESEIGFETDTGFARAGDGHNRFSDLGYISPLDY
Sbjct: 1 MSENIPLRVQFKRMKAAEWARSDVILLESEIGFETDTGFAEJAGDGHNRFSDLGYISPLDY 60

Query: 61 NLLTNKPNIDGLA.TKVETAQKLQQKADKETTOTK3ffiSKQELDKKIJ)KiKGGVOT 120
NIiLTNKPNIDGLATKVETAQBIiQQKaDKETVYTKaESRQELDK^^
Sbjct: 61 lSnJLTNKENIDGIATKVETi«3KLQQKADKETVYTKaESKQEIjDK2am.KGG^ 120

Query: 121 AATVAYSSSTGGAWIDLSSTRGAGVWYSDNDTSDGPLMSLRTGKETFNQSALFVDYKG 180
AATVAYSSSTGGAUNXDLSSTRGAGWVYSDNDTSDGPLMSLRTGKETFNQSALFVDYKG
Sbjct: 121 AATVAYSSSTGGAWIDLSSTRGAGVVVYSDNDTSDGPLMSLRTGKETTOQSALFVDYK^ 180

Query: 181 TTNAVNIAMRQPTTENFSSAIiNITSGNENGSAMQLRGSEKAI^TO 240
TTNAVNIAMR TTENFSSALNITSGNENGSAMQLRGSEKALGTLKITHENPSIGADYDK
Sbjct: 181 TTmVNIAMRHATTPNFSSALNITSGNENGSAMQLRGSEKALGTLKIT^^ 240

Query: 241 NAAALSIDIVKKTNGAGTAAQGIYINSTSGTTGKLLRIRNLSDDKFYVKSDGGFYAKETS 300
NAA + + K+ NGAGTAAQGIYINSTSGTTGKLLRIRMLSDDKFYVKSDGGFYAKETS
Sbjct: 241 NAARYPLILSKRQNGAGTAAQGIYINSTSGTTGKLLRIRNLSDDKFYVKSDGGFYAKETS 300

Query: 301 QIDGNLKLKDPTANDHAATKAYVDKAISELKKLILKK 337
QIDGNLKLKDPTAHDHRATKAYVDKAISELKKLILKK
Sbjct: 301 QIDGNLKLKDPTANDHAATKAYVDKAISELKKLILKK 337

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2590 

A DNA sequence (GASx698) was identified in S.pyogenes <SEQ ID 7679> which encodes the amino acid 
sequence <SEQ ID 7680>. Analysis of this protein sequence reveals the following:

Possible site: 17

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.4208 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

RGD motif 54-56 

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAA98102 GB:M19348 ORF [Streptococcus pyogenes phage H4489A]
Identities = 250/648 (38%), Positives = 351/648 (53%), Gaps = 75/648 (11%)

Query: 1 MSRDPTLILDESNLVIGKDGRVHYTFTTEDDNPKVRLASKCLGTAHFNQLMIERGDQATS 60
MSRDPT ++E +L DGR + TF + + VRL S CLG +L +E +
Sbjct: 1 MSRDPTYTINEHDLSFA-DGRFYOTFKADKSSETTOLNSSCLGNTIIKKIiQVEDDOTMHD 59

Query: 61 YVAPVVVEGTGNPTGLFKDLKEISLELTDTANSQLWSKIKIiTISIRGMLQEYYDGKIKTEIV 120
+V P V T GL + +KE+ L+L D S LW KIK N+ ML EY + ++ + I
Sbjct: 60 FVia>KVT--TQQRFGIAQQVKELDLQLKI)P-KSDLWGKIICENNKaMLVEyAl«CEMSSAIA 116

Query: 121 NSARGVATRISEDTDKKIALINDTIDGIRREYRDADRKLSASYQaGIEGLKATMSNDKIG 180
SA + ++ D++ + T++GI++ +
Sbjct: 117 QSaEQILLQVKSIDDERYSKFEQTIiWGIKQTVKSES-- 152

Query: 181 LQAEIKASAQGLSQKYDDELRKLSAKITTTSSGTTEAYESKLAGLRAEPTRSNQGTRTEL 240
++++ L+ +D + L K + S T ++ S+L G + L
Sbjct: 153 VESRRTQLASMPDSRISGLD6KYSRLSQ-TIDSLSSRLD -DGVGNYSTL 199

Query: 241 ESQISGLRAVQQSTASQISQEIRDREGAVSRVQQSLESYQRRMQDAEENYSSLTHTVRGL 300
++SG I + + VSR+ Q+ + Q ++ +A +NYSSL+ TV+GL
Sbjct: 200 SQKVSG IDLRVSNaaNDVSRLSQTAQGLQSQITNANQNYSSLSQTVQGL 248

Query: 301 QSDVGSPTGKIQSRLTQIiAGQIEQRVTRDGVMSirSGAGDSIKIAIQKAGGINAKMSGNE 360
Q+ V SR+ QL+ I +VT+ V + 1+ + D I AI+ + KM+G+E
Sbjct: 249 QTTVRDNQSNATSRINQLSDLISTKVTHGDVETTIAQSYDKIAFAIRDKLPAS-KMTGSE 307

Query: 361 IISAimjSISYG\7TIAGKHIALDGNTTVNGTFTTKIAEAIKIRADQIIAGTIDAftRIRVIN 420
IISAINL+ GV I GK+I LDGM+ ++ K A + A +1 G ++A+RI
Sbjct: 308 IISAINLDRSGVKITGKNITLDGNSYISNA-VIKDAHIANMDAGKINTGYliNASRIAAKA 366

Query: 421 LNASSIVGLDANFIK- -AKIGY AIT- - -DLLEGKVIKARNGftMLI 460
+ I A P K A GY A+T + G V+ A NGA
Sbjct: 367 ITGDKIKMJYAFENKLTAKEEYFRTLFAKNIFTTSVQAVTTSASKITGGVLSATN^ 426

Query: 461 DLNTAKMDFNSDATINFNSKNNALVRKDGTHTAFVHFSNATPKGYTGSALYASIGITSSG 520
DLN+A +DFN nATINFNSKNNALVRK GT+TAFVHFSNATPKGY GSALYASIGITSSG
Sbjct: 427 DIiWSaNIDENRDATINENSKNNaiiTOKSGTNTAFVHFSNATPKGYRGSALYaSIGITSSG 486

Query: 521 DGVNSASSGRFAGIiRSFRYATGYNHTAAVDQTEIYGDNVLVVDDFNITRGFKFRPDKMQK 580
DG++SASSGRF G+R FRiA G HTA VDQ EIYGD+++ DDFNI RGFK RF M K
Sbjct: 487 DGIDSASSGRFCGVRPFRYAEGLQHTAKVDQAEIYGDDIVFSDDFNIDRGFKMRPSIjMPK 546

Query: 581 l^MNDLYAAVVALGRCMGHLANVGVm'AHSNFTSAVNREU^ 628
M+D+N +Y A++ALGRCW H N W+ + + SA+ E N +1 +
Sbjct: 547 MTOIiNKMYQAILALGRCWLHANm'AWSW-NFDTRSAIIAEYNaHira^ 593

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 
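
The analysis of Example 2590 reports an RGD motif at positions 54-56. As an illustration only (not part of the original analysis), the short Python sketch below locates every occurrence of a motif in a sequence and reports 1-based inclusive positions, which is the numbering convention the listing appears to use. The function name `find_motif` is hypothetical; the test string is the first 60 residues of the Query row printed in the alignment for this Example.

```python
def find_motif(sequence, motif="RGD"):
    """Return 1-based inclusive (start, end) positions of each occurrence."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        # str.find is 0-based, so shift by one for residue numbering.
        hits.append((start + 1, start + len(motif)))
        start = sequence.find(motif, start + 1)
    return hits


# Residues 1-60 of the Query row shown in the alignment of this Example.
first_60 = "MSRDPTLILDESNLVIGKDGRVHYTFTTEDDNPKVRLASKCLGTAHFNQLMIERGDQATS"
print(find_motif(first_60))  # [(54, 56)]
```

On this fragment the sketch returns the single hit (54, 56), in agreement with the positions stated in the listing.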



Example 2591 

A DNA sequence (GASx699) was identified in S.pyogenes <SEQ ID 7681> which encodes the amino acid 
sequence <SEQ ID 7682>. Analysis of this protein sequence reveals the following: 

Possible site: 36

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.3323 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2592 

A DNA sequence (GASx701) was identified in S.pyogenes <SEQ ID 7683> which encodes the amino acid 
sequence <SEQ ID 7684>. Analysis of this protein sequence reveals the following: 

Possible site: 20

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.1017 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2593 

A DNA sequence (GASx702) was identified in S.pyogenes <SEQ ID 7685> which encodes the amino acid
sequence <SEQ ID 7686>. Analysis of this protein sequence reveals the following:

Possible site: 27

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -3.03 Transmembrane 2 - 18 ( 1 - 23)

Final Results

bacterial membrane Certainty=0.2211 (Affirmative) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2594 

A DNA sequence (GASx703) was identified in S.pyogenes <SEQ ID 7687> which encodes the amino acid 
sequence <SEQ ID 7688>. Analysis of this protein sequence reveals the following:

Possible site: 25

>>> Seems to have a cleavable N-term signal seq.

INTEGRAL Likelihood = -3.45 Transmembrane 36 - 52 ( 36 - 55)

Final Results

bacterial membrane Certainty=0.2381 (Affirmative) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAC39287 GB:AF115103 orfSV gp [Streptococcus thermophilus bacteriophage Sfi21]
Identities = 43/73 (58%), Positives = 61/73 (82%)

Query: 1 MINLKLRLQNKOTLMAILGAIFLLAQQLGIKLPSNIADIANTAVTLLVLLGVVTDPTTKG 60
MIN KLRLQNK TL+A++ A+FL+ QQ G+ +P+NI + NT V +LV+LG++TDPTTKG
Sbjct: 8 MINFKLRLQNKaTLVALISAVFLMLQQFGLHVPNNIQEGINTLVGILVILGIlTDPTTKG 67

Query: 61 LSDSEQALTYHEP 73
++DSE+AL+Y +P
Sbjct: 68 lADSERALSYIQP 80

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2595 

A DNA sequence (GASx707R) was identified in S.pyogenes <SEQ ID 7689> which encodes the amino acid 
sequence <SEQ ID 7690>. Analysis of this protein sequence reveals the following: 

Possible site: 22 

>>> Seems to have an uncleavable N-term signal seq 

INTEGRAL Likelihood =-10.35 Transmembrane 9 - 25 ( 1 - 27) 

Final Results 

bacterial membrane Certainty=0.5140 (Affirmative) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 
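
Several Examples, including Example 2595, report predicted membrane-spanning regions on lines beginning with INTEGRAL. The following Python sketch, offered as an illustration only and not as part of the original analysis, extracts the likelihood score and the two residue ranges from such a line. The function name `parse_integral` is hypothetical, and the labels `core` and `extended` for the unparenthesized and parenthesized ranges are assumptions chosen here for readability; the text itself does not name them.

```python
import re

# Hypothetical pattern for an "INTEGRAL" line as laid out in this document:
# a signed likelihood, a residue range, and a second range in parentheses.
INTEGRAL_RE = re.compile(
    r"INTEGRAL\s+Likelihood\s*=\s*(-?\d+\.\d+)\s+Transmembrane\s+"
    r"(\d+)\s*-\s*(\d+)\s*\(\s*(\d+)\s*-\s*(\d+)\s*\)"
)


def parse_integral(line):
    """Return (likelihood, core_range, extended_range) or None if no match."""
    match = INTEGRAL_RE.search(line)
    if match is None:
        return None
    likelihood = float(match.group(1))
    core = (int(match.group(2)), int(match.group(3)))      # assumed label
    extended = (int(match.group(4)), int(match.group(5)))  # assumed label
    return likelihood, core, extended


line = "INTEGRAL Likelihood =-10.35 Transmembrane 9 - 25 ( 1 - 27)"
print(parse_integral(line))
```

Returning `None` for lines that do not match lets the same function be applied to every line of a listing without prior filtering.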

Example 2596 

A DNA sequence (GASx714R) was identified in S.pyogenes <SEQ ID 7691> which encodes the amino acid
sequence <SEQ ID 7692>. Analysis of this protein sequence reveals the following:

Possible site: 26

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.1401 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2597

A DNA sequence (GASx715) was identified in S.pyogenes <SEQ ID 7693> which encodes the amino acid 
sequence <SEQ ID 7694>. Analysis of this protein sequence reveals the following: 

Possible site: 20

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.0417 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2598 

A DNA sequence (GASx726) was identified in S.pyogenes <SEQ ID 7695> which encodes the amino acid 
sequence <SEQ ID 7696>. Analysis of this protein sequence reveals the following: 

Possible site: 33

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -1.17 Transmembrane 18 - 34 ( 18 - 35)

Final Results

bacterial membrane Certainty=0.1468 (Affirmative) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2599

A DNA sequence (GASx728R) was identified in S.pyogenes <SEQ ID 7697> which encodes the amino acid 
sequence <SEQ ID 7698>. Analysis of this protein sequence reveals the following: 

Possible site: 29

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.1795 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAF61314 GB:U96166 unknown [Streptococcus cristatus]
Identities = 149/194 (76%), Positives = 162/194 (82%)

Query: 1 LSAIIRQSTSKRISDKRGIYLVEKLVSLAKQSYFTVTKTSPMIEEVRYYRKELLRLSERR 60
L IIRQSTSKRIS+KR YL +KL+ LAKQS+ V KTSPM+EEVRYYA+ELLRLSERR
Sbjct: 56 LYEIIRQSTSKRISEKRIAYLTDKLIKLAKQSFCAVKKTSPMLEEWYYAQELLRLSERR 115

Query: 61 QAIFDKMVASAQPLPEDKILRSIPSIVETTATSIIGELGAIRRFQSANQINAFIGIDFRH 120
Q + + MVA AQPLPE ILRSIP I ETTATSIIGELG I RFQS NQ NAFIGID RH
Sbjct: 116 QVVLNDMVALAQPLPEYDILRSIPGIAETTATSIIGELGDIHRFQSTNQFNAFIGIDLRH 175

Query: 121 YESGNYLAQEHITKRGNPYAPKIIiFKCIHDIAFASHTNPCHIADFYEKRKRQSQTASTKP 180
YES N+LA+EHITKRGNPYA KILFKCIH+IA ASHTNPCHIADFYEKRKRQS ASTKP
Sbjct: 176 YESRNFIAKEHITKRGNPYARKILFKCIHNIASASHTNPCHIADFYEKRKRQSTIASTKP 235

Query: 181 HTIASRHCLVRQCF 194
TIAS H L+R +
Sbjct: 236 LTIASIHRLIRTMY 249

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2600 

A DNA sequence (GASx729R) was identified in S.pyogenes <SEQ ID 7699> which encodes the amino acid 
sequence <SEQ ID 7700>. Analysis of this protein sequence reveals the following:

Possible site: 28

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.2363 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2601 

A DNA sequence (GASx730R) was identified in S.pyogenes <SEQ ID 7701> which encodes the amino acid
sequence <SEQ ID 7702>. Analysis of this protein sequence reveals the following:

Possible site: 25

>>> Seems to have an uncleavable N-term signal seq

Final Results

bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2602

A DNA sequence (GASx734) was identified in S.pyogenes <SEQ ID 7703> which encodes the amino acid 
sequence <SEQ ID 7704>. Analysis of this protein sequence reveals the following:

Possible site: 52

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.4001 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2603 

A DNA sequence (GASx735) was identified in S.pyogenes <SEQ ID 7705> which encodes the amino acid 
sequence <SEQ ID 7706>. Analysis of this protein sequence reveals the following:

Possible site: 55

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -3.66 Transmembrane 276 - 292 ( 274 - 292)

Final Results

bacterial membrane Certainty=0.2466 (Affirmative) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>
bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
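
The localization predictions reported under "Final Results" in these examples follow a fixed line pattern, so they can be tabulated mechanically. The following Python sketch is illustrative only and is not part of the original analysis: the function names `parse_localization` and `best_site` are hypothetical, and the pattern is deliberately tolerant of the spacing irregularities that appear in scanned copies of the text.

```python
import re

# Tolerant pattern (an assumption about the line layout): optional dashes
# after the compartment name, and optional whitespace inside the number,
# e.g. "0. 2466" as it can appear in scanned text.
LINE_RE = re.compile(
    r"bacterial\s+(?P<site>membrane|outside|cytoplasm)\W*"
    r"Certainty\s*=\s*(?P<value>[0-9][0-9.\s]*?)\s*\((?P<label>[^)]*)\)"
)


def parse_localization(text):
    """Return {site: (certainty, label)} for each recognised result line."""
    results = {}
    for match in LINE_RE.finditer(text):
        value = float(re.sub(r"\s+", "", match.group("value")))
        results[match.group("site")] = (value, match.group("label").strip())
    return results


def best_site(results):
    """Pick the compartment with the highest certainty score."""
    return max(results, key=lambda site: results[site][0])


sample = """
bacterial membrane --- Certainty=0. 2466 (Affirmative)
bacterial outside --- Certainty=0.0000(Not Clear)
bacterial cytoplasm --- Certainty=0.0000 (Not Clear)
"""
parsed = parse_localization(sample)
print(best_site(parsed), parsed[best_site(parsed)])
```

Applied to the Example 2603 values, the sketch selects the membrane compartment, which is the only one flagged "Affirmative".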

Example 2604 

A DNA sequence (GASx736) was identified in S.pyogenes <SEQ ID 7707> which encodes the amino acid 
sequence <SEQ ID 7708>. Analysis of this protein sequence reveals the following: 

Possible site: 33 



>>> Seems to have no N-terminal signal sequence 



Final Results 

bacterial cytoplasm --- Certainty=0.3998 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2605 

A DNA sequence (GASx737) was identified in S.pyogenes <SEQ ID 7709> which encodes the amino acid 
sequence <SEQ ID 7710>. Analysis of this protein sequence reveals the following: 

Possible site: 60 

>>> Seems to have a cleavable N-term signal seq. 

INTEGRAL Likelihood =-12.74 Transmembrane 77 - 93 ( 69 - 99) 

INTEGRAL Likelihood = -4.14 Transmembrane 152 - 168 ( 151 - 170) 

INTEGRAL Likelihood = -1.17 Transmembrane 196 - 212 ( 194 - 212) 

Final Results

bacterial membrane --- Certainty=0.6095 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2606 

A DNA sequence (GASx738) was identified in S.pyogenes <SEQ ID 7711> which encodes the amino acid 
sequence <SEQ ID 7712>. Analysis of this protein sequence reveals the following: 

Possible site: 37 

>>> Seems to have a cleavable N-term signal seq.

INTEGRAL Likelihood =-13.16 Transmembrane 44 - 60 ( 39 - 71)
INTEGRAL Likelihood =-10.24 Transmembrane 94 - 110 ( 81 - 114)
INTEGRAL Likelihood = -7.64 Transmembrane 185 - 201 ( 179 - 207)
INTEGRAL Likelihood = -7.48 Transmembrane 132 - 148 ( 130 - 158)
INTEGRAL Likelihood = -2.76 Transmembrane 208 - 224 ( 204 - 225)
INTEGRAL Likelihood = -0.06 Transmembrane 153 - 169 ( 152 - 169)

Final Results

bacterial membrane --- Certainty=0.6255 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
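
The "INTEGRAL" lines list predicted membrane-spanning segments in order of likelihood rather than in sequence order. As a hedged illustration (not part of the original analysis), the Python sketch below parses such lines and reorders the segments by position along the protein; the `Segment` type and `parse_segments` function are hypothetical names introduced here, and the sample lines are the first three segments reported for Example 2606.

```python
import re
from typing import NamedTuple


class Segment(NamedTuple):
    likelihood: float  # more negative means a more confident prediction
    start: int         # first residue of the predicted segment
    end: int           # last residue of the predicted segment


# Assumed line layout: "INTEGRAL Likelihood =<score> Transmembrane <a> - <b> (...)".
SEG_RE = re.compile(
    r"INTEGRAL\s+Likelihood\s*=\s*(-?\d+\.\d+)\s+Transmembrane\s+(\d+)\s*-\s*(\d+)"
)


def parse_segments(text):
    """Return the predicted segments sorted by start position."""
    segments = [
        Segment(float(score), int(start), int(end))
        for score, start, end in SEG_RE.findall(text)
    ]
    return sorted(segments, key=lambda seg: seg.start)


sample = """
INTEGRAL Likelihood =-13.16 Transmembrane 44 - 60 ( 39 - 71)
INTEGRAL Likelihood =-10.24 Transmembrane 94 - 110 ( 81 - 114)
INTEGRAL Likelihood = -7.64 Transmembrane 185 - 201 ( 179 - 207)
"""
segments = parse_segments(sample)
for seg in segments:
    print(seg.start, seg.end, seg.likelihood)
```

Sorting by start position makes it easier to see how many times the predicted protein crosses the membrane and where the intervening loops lie.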

Example 2607 

A DNA sequence (GASx742) was identified in S.pyogenes <SEQ ID 7713> which encodes the amino acid 
sequence <SEQ ID 7714>. Analysis of this protein sequence reveals the following: 

Possible site: 22

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -7.80 Transmembrane 887 - 903 ( 882 - 906)
INTEGRAL Likelihood = -4.88 Transmembrane 6 - 22 ( 5 - 23)

Final Results

bacterial membrane --- Certainty=0.4121 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>



LPXTG motif: 877-881 

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAB46409 GB:AL096743 putative large secreted protein
[Streptomyces coelicolor A3(2)]
Identities = 231/599 (38%), Positives = 329/599 (54%), Gaps = 43/599 (7%)

Query: 278 TSSNSDASSRNIVKIGEIQGASHTSPLLKKAVTVEQVWTYL DDSTHFYVQDLNGDG 334

T +++ +++ V+I ++QG++ SP + VT +VT + S F++QD D 

Sbjct: 28 TPAHAASAAAGPVRIHDVQGSTRLSPYAGEQVTDVAGIVTGVRGYGSSKGFWMQDPLPDA 87 

Query: 335 DIiATSDGIRVFAKNA-KVQVGDVLTISGEVEEFFGRGYEERKQTDLTITQIVAKAVTK-T 392 

D ATS+G+ VF A +V VGD +T+SG V E+ G Q+ +T+I VT + 
Sbjct: 88 DPATSEGVPVFTSRAPEVAVGDAVTVSGTVSEYVPGGTSSGNQS LTEITRPTVTWS 144 

Query: 393 GTAQVPSPLVLGKDRIAPANIIDNDGLR VFDPEEDAIDYWESMEGMLVAVDDA 445 

G +P+ + + A + DG P A+DY+ES+EGM V V DA 

Sbjct: 145 GGmiPAATWSARSVPRAYAPEGDGAAWGSVNALPLRPGTYALDYYESLEGr#IVRVarffi 204 

Query: 446 KILGPMKN-KEIYVLPGSSTRPIiNNSGGVLLPANSYjrrDVIPVLFKKGKQI IKAGD 500 

+++G E++V PGV+NT++GK GD 

Sbjct: 205 RWGASDPyTELWVTVKPWENPNRRGGTVYGSYDDQNTGRLQIQ-SLGKPADFPAADVGD 263 

Query: 501 SYKGRLAGPVSYS-YGNYKVFVDDSKimPSLMDGHLKPEKTNLQKDLSKLSIASYNIENF 559 

+ G AGP+ Y+Y6Y+ + ++LG+ETQ +L++A+YN+EN 
Sbjct: 264 TLAGTTAGPLDYNQYGGYTLVASE IGALESGGTERESTRRQS-AREIjAVATYISIVENL 319 

Query: 560 SANPSSTKDEKVKRIAESFIHDLNAPDIIGLIEVQDNNGPTDDGTTDATQSAQRLIDAIK 619 

+PS D+ AE+ +H L +PDI+ L E+QDNNG TDDGT A + RLIDAI 

Sbjct: 320 --DPS---DDTFTAHAETIVHRLKSPDIVSLEEIQnNNGATDDGTVaADATVGRIiIDAIV 374 

Query: 620 KIiGGPTYRYVDIAPENNVIX3GQPGGNIRTGFLYQPERVSLSDKPKGGARDA--LTWra 677 

GGP Y + I P + DGGQPGGNIR FL+ PERVS +D+ 6 A A + V G+ 
Sbjct: 375 AAGGPRYDWRGIDPVDKADGGQPGGNIRQAFLENPERVSFTDRAGGDATTATGVRKVRGK 434 

Query: 678 --limiSVGRIDPTNAAWKDWKSIjyffiFIFQGRKVVWANHLNSKRGDNALYGCVQPVTF 735 

L S GR+DP N AW+D RK LA EF+F+GR V VV7\NH NSK GD L QP + 
Sbjct: 435 AALTHSPGRVDPANEAWEDSRKPLAGEFVFRGRTVFWANHENSKGGDQGLTAQYQPPSR 494 

Query: 736 KSEQRRHVIAiqM]liAQFAKE--GAKHQANIVMLGDEraDFEFTKTIQLIE-EGDNIVlTLVSra 792 

SE +RH A ++ F KE A+ A++V LGD NDFEF++T +++E +G + + V 
Sbjct: 495 GSETQRHAQAKVV^^^FVKEIIIAAQK^laDVVaLGDI]®FEFSRTARILEGD^^ 554 

Query: 793 DISDRYSYFHQGNNQTLDNILVSRHLL--DHyEFDMVHVNSPFMEAHGRASDHDPLLLQ 849 

S+RYSY +QGN+Q LD ILVS + H +D VHVN+ F H + SDHDP +L+ 
Sbjct: 555 PRSERYSYVYQGNSQVLDQILVSPSVRRGGHLSYDSVHVNAEF---HDQISDHDPQVLR 610 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
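
The homology summaries in these examples report identities, positives and gaps as a count over the alignment length, followed by a percentage. The Python sketch below is offered only as an illustration and is not part of the original analysis; `parse_alignment_stats` and `truncated_percent` are hypothetical names. For the Example 2607 summary line, the reported percentages agree with the counts rounded down to a whole number. That is an observation about this one line, not a documented rule, and figures elsewhere in a scanned copy may not agree.

```python
import re

# Matches e.g. "Identities = 231/599 (38%)"; spacing is allowed to vary.
STAT_RE = re.compile(
    r"(Identities|Positives|Gaps)\s*=\s*(\d+)\s*/\s*(\d+)\s*\((\d+)%\)"
)


def parse_alignment_stats(line):
    """Return {name: (count, length, reported_percent)} for one summary line."""
    return {
        name: (int(count), int(length), int(pct))
        for name, count, length, pct in STAT_RE.findall(line)
    }


def truncated_percent(count, length):
    """Whole-number percentage, rounded down."""
    return count * 100 // length


line = "Identities = 231/599 (38%) , Positives = 329/599 (54%) , Gaps = 43/599 (7%)"
stats = parse_alignment_stats(line)
for name, (count, length, reported) in stats.items():
    print(name, count, length, reported, truncated_percent(count, length))
```

A check of this kind can help flag summary lines whose digits were damaged during scanning, since a mismatch between the counts and the percentage suggests that one of them was misread.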

Example 2608 

A DNA sequence (GASx743) was identified in S.pyogenes <SEQ ID 7715> which encodes the amino acid 
sequence <SEQ ID 7716>. Analysis of this protein sequence reveals the following: 

Possible site: 22 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2437 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2609 

A DNA sequence (GASx756) was identified in S.pyogenes <SEQ ID 7717> which encodes the amino acid 
sequence <SEQ ID 7718>. Analysis of this protein sequence reveals the following: 

Possible site: 18 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -4.30 Transmembrane 10 - 26 ( 8 - 27)
INTEGRAL Likelihood = -3.08 Transmembrane 51 - 67 ( 50 - 67)

Final Results

bacterial membrane --- Certainty=0.2720 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2610 

A repeated DNA sequence (GASx758) was identified in S.pyogenes <SEQ ID 7719> which encodes the 
amino acid sequence <SEQ ID 7720>. Analysis of this protein sequence reveals the following: 

Possible site: 22 

>>> Seems to have a cleavable N-term signal seq.

Final Results

bacterial outside --- Certainty=0.3000 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAA38133 GB:X54225 7 kDa protein [Streptococcus pneumoniae] 
Identities = 31/61 (50%), Positives = 41/61 (66%) 

Query: 1 MTNGLKYVLEQMLLLFIIAALACLFLAIGLMIGYSPMGDGQSPWHILSMDKWAELVNKFT 60 

M YV++++LL+ 1+ L L L IGLM+GY +G GQ PW ILS KW EL++KFT 

Sbjct: 3 MNKKSSYVVKRLLLVIIVLILGTLRLGIGLMVGYGILGKXSQDPWaiLSPAKWQELIHCT^ 62 

Query: 61 G 61

G
Sbjct: 63 G 63 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2611 

A DNA sequence (GASx764) was identified in S.pyogenes <SEQ ID 7721> which encodes the amino acid 
sequence <SEQ ID 7722>. Analysis of this protein sequence reveals the following:

Possible site: 58

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -3.98 Transmembrane 47 - 63 ( 46 - 67)

Final Results

bacterial membrane --- Certainty=0.2593 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

A related sequence was also identified in GAS <SEQ ID 9149> which encodes the amino acid sequence 
<SEQ ID 9150>. Analysis of this protein sequence reveals the following: 

Possible site: 53

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -3.98 Transmembrane 35 - 51 ( 34 - 55)

Final Results

bacterial membrane --- Certainty=0.2593 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2612 

A DNA sequence (GASx783) was identified in S.pyogenes <SEQ ID 7723> which encodes the amino acid 
sequence <SEQ ID 7724>. Analysis of this protein sequence reveals the following:

Possible site: 43

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood =-13.16 Transmembrane 142 - 158 ( 132 - 167)
INTEGRAL Likelihood =-12.26 Transmembrane 113 - 129 ( 101 - 140)
INTEGRAL Likelihood =-10.24 Transmembrane 238 - 254 ( 233 - 260)
INTEGRAL Likelihood = -2.76 Transmembrane 34 - 50 ( 34 - 51)

Final Results

bacterial membrane --- Certainty=0.6265 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:BAA32091 GB:AB010970 ABC-transporter [Streptococcus mutans] 
Identities = 173/269 (64%) , Positives = 214/269 (79%), Gaps = 2/269 (0%) 

Query: 1 MNFLTKKNRILLREtmCTDFkLRYQGSAIGYLWSILKPLmFTItmiVFiRFLRLGCaiVP 60 
M+F ++KNRILL+E++KTDFKLRyQGSAIGYLWSILKPLM+F IMY+VF+RFL LGG+VP 

Sbjct: 1 MDFFSRKNRILLKELIKTDFKLRYQGSAIGYLWSILKPLMLFAIMYIVFVRFLPLGGDVP 60 

Query: 61 HFPVALLLANVIWSFFSEATSMGMVSIVSRGDLLRKLNFSKHIIVFSAVLGALINFLINL 120 
H+PVALLL NVIW+FF E T MGMVS+V+RGDLLRKLNFSK IVFSAV GA INF IN+ 
Sbjct: 61 HWPVALLLGNVIWTFFQETTMMGMVSWTRGDLLRKLNFSKQTIVFSAVSGAAINFGINV 120 

Query: 121 WVLIFALINGVTIS--GYAYLSLFLFIELVVLVLGIALLLSNVFVYYRDLAQVWEVLLQ 178 

+WLIFAL+NGVT + +L + LF+EL++ GIA +LS ++V YRD+ VWEV+LQ 

Sbjct: 121 IVVLIFALiaJGVTFTFRWNLFLLIPLFLELLLFSTGIAFILSTLYVRYRDIGPVWEVILQ 180 






Query: 179 AGMYATPIIYPITFVLDSHPLAAKLLMLNPVAQMIQDFRYLLIDRANVTIWQMSTNWPYI 238 

G Y TPIIY +T++ + AKIjL+L+P+AQ+IQD R++LID ANVTIWQM + 

Sbjct: 181 C3GFYGTPIIYSLTYIATRSWGAKLLLLSPIAQIIQDMRHILIDPANVTIWQMINHKSIA 240 

Query: 239 VIPYLVPFVILFIGIFVFKKNADRFAEII 267 

VIPYLVP + IG VF NA +FAEII 
Sbjct: 241 VIPYLVPIFVFIIGFLVFNYNAKKFAEII 269 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2613 

A DNA sequence (GASx786) was identified in S.pyogenes <SEQ ID 7725> which encodes the amino acid 
sequence <SEQ ID 7726>. Analysis of this protein sequence reveals the following: 

Possible site: 32 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3828 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:BAA32094 GB:AB010970 rgpFc [Streptococcus mutans] 
Identities = 381/582 (65%) , Positives = 475/582 (81%) , Gaps = 1/582 (0%) 



Query: 1 MiroiliLYVHFNKYNKISaHVYYQLEQMRSLFSKIVFISNSKVSHEDLKRLKiraCLI 60
M R+Li:iYVHFNKYN++S+HV YQIi QMRSLFSK++FISNS+V+ D+K L+ LID+F+
Sbjct: 1 MKRLLLYVHFNKYNRVSSHWYQLTQMRSLFSKVIFISNSQVADADVKMLREKHLIDDFI 60

Query: 61 QRKNKGFDPSAWHDGLIIMGFDKLEEFDSLTIMNDTCFGPIWEMAPYFENFEEKETVDFW 120
QR+N 6FDF+AW DG++ +GFD+L +DS+T MNDTCFGP+WEM ++ FE K TVDFW
Sbjct: 61 QRQNSGFDFAAWRDGMVFVGFDELVTYDSVTTMimTCFGPIiWEMYSIYQEPETKTTVDFW 120

Query: 121 GITNITOGTKAFKEHVQSYFMTFKNQVIQNKVFQQFWQSIIEYENVQEVIQHYETQLTSIL 180
G+TNNR TK+F+EH+QSYF++FK V+++ F+ PW++I EY++VQ+VI YET++T+ L
Sbjct: 121 GLraNRATKSFREHIQSYFISFKASVLRSTAFRDPWENIKEYQDVQKVIDQYETKVTTTL 180

Query: 181 LNEGFSYQWFDTRKAESSFMPHPDFSYYWPTAILKHHVPFIKVKAIDANQHIAPYLIiNL 240
L+ GF Y VFDT K ++S M H DFSYYNPTAIL H VPFIKVKAID NQHI PYLIiN
Sbjct: 181 LDAGFQYDVVFDTTKEDASHMLHADFSYYNPTAIUjmVPFIKVKAIDmQHITPYLIJro 240

Query: 241 IRETTOTPIDLIVSHMSQISLPDTKYLLSQKYIJICQRIAKQTCQKVAVHLHVFYVDLLDE 300
I++ + YPIDLIVSHMS+1+ PD YLL KY+ + QKVAVHLHVFYVDLL+E
Sbjct: 241 IQKNSTYPIDLIVSHMSEINYPDFSYLLGHKYVI<:KRERVDLKNQKVAVHLHVFYVDLLEE 300

Query: 301 FLTAFENWNFHYDLFITTDSDIKRKEIKEILQRKGKTADIRVTGmGRDIYPMLIjLKDK^ 360
FLTAF+ ++F YDLFITTDSD K+ EI+EIL G+ A + VTGN GRD+ PML LK+ L
Sbjct: 301 FLTAFKQFHFSYDLFITTDSDDKKaEIEEILSSNGQEAQVFVTGNIGRDVliPMLKLKNYr. 360

Query: 361 SQYDYIGHFHTKKSKEADFWAGESWRKEIjlDMIjVKPADSILSAFETD-DIGIIIADIPSF 419
S YD++GHFHTKKSKEADFWAG+SWR+ELIDMLVKPAD+IL+ + + IG++IAD+P+F
Sbjct: 361 SAYDFVGHFHTKKSKEADPWAGQSWREELIDMLVKPADNILAQLQQNPKIGLVIADMPTF 420

Query: 420 FRENKIVNAWlSIEHLIAQEMMSLWRK^©VKKQIDFQA^roTPVMSYGTE^ 479
FR+NKIV+AWNEHLIA EM +LW+KM + K+IDF A TFVMSYGTFVMPKYDaiiK LFD
Sbjct: 421 FRYNKIVDAWlffiHLIAPENIOTLWQKMGMTKKIDFNaFHTFVMSYGTFVWPK^ 480

Query: 480 LELTQNDIPSEPLPQNSILHAIERLLVYIAWGDSyDFRIVKNPYELTPFIDNKLUILRED 539
L LT +D+P EPLPQNSILHAIERLL+YIAW + YDFRI KNP +LTPFIDNKLLN R +
Sbjct: 481 IJSLTDDDVPEEPLPQNSILHAIERLLIYIAlOiEHYDFRISKNPVDLTPFIDNKLIJlffiRGN 540

Query: 540 EGAHTYVNFNQMGGIKGALKYIIVGPAKftMKYIFLRLMEKLK 581 

+T+V+FN MGGIKGiA KYI +GPA+A+Kyi R ++K+K 
Sbjct: 541 SAENTFVDEimiGGIKGaFKyiFIGEaRAVKYILKRSLQKIK 582 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2614 

A DNA sequence (GASx787) was identified in S.pyogenes <SEQ ID 7727> which encodes the amino acid 
sequence <SEQ ID 7728>. Analysis of this protein sequence reveals the following: 

Possible site: 33 

>>> Seems to have a cleavable N-term signal seq.

INTEGRAL Likelihood =-15.66 Transmembrane 202 - 218 ( 191 - 224)
INTEGRAL Likelihood =-10.03 Transmembrane 340 - 356 ( 335 - 365)
INTEGRAL Likelihood = -9.08 Transmembrane 270 - 286 ( 263 - 289)
INTEGRAL Likelihood = -8.60 Transmembrane 124 - 140 ( 118 - 145)
INTEGRAL Likelihood = -4.94 Transmembrane 377 - 393 ( 375 - 395)
INTEGRAL Likelihood = -3.29 Transmembrane 291 - 307 ( 290 - 311)
INTEGRAL Likelihood = -2.87 Transmembrane 160 - 176 ( 159 - 180)
INTEGRAL Likelihood = -2.66 Transmembrane 50 - 66 ( 48 - 66)
INTEGRAL Likelihood = -1.28 Transmembrane 77 - 93 ( 76 - 93)
INTEGRAL Likelihood = -0.69 Transmembrane 229 - 245 ( 229 - 245)

Final Results

bacterial membrane --- Certainty=0.7262 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>



No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:BAA32095 GB:AB010970 0RF7 [Streptococcus mutans] 
Identities = 374/775 (48%), Positives = 525/775 (67%), Gaps = 7/775 (0%) 

Query: 53 VSFVGYIISLIGLSYYLSRQVSRQLFLKTSFIVISYLIVSYWVQITQHLNDKRPDIWSLT 112 

V V Y^t-fS^f^fGLS^hYLS^t^ ■!■ + F++ Y■^■^■•^SY•^•^ +T+ m++ P IW L 

Sbjct: 30 VCIiVIYVLSILGLSFYLSKin:.KKrFFIELLLGYGLYIVTSYFLAVTRELNNESFKIWDLA 89 

Query: 113 KNQFYQFQALPSLLIXLV MATLIKILRAYFAIEKDRFGLL-GYQGNTFSVALILAV 167 

KN F+Q 'LP+h++Z+ + + + LL ■!■ F ■(• ■l-l- 

Sbjct: 90 KiraFFQPYFLPTLVLIIACTFALNYLIRVKMKRSHLSRKMTLLLENFSETEFLLTGLIVS 149 

Query: 168 VPINDIHLLKIISSRFSELVTAGNSQIALLKISGLLIVLLVIPATIIYVVLNALKHLKSN 227 

++D +KL+ + -fLL ■(■ LL L^h-fF-f I-f NA -!■ -fK N 

Sbjct: 150 FILSDTLYVKLLQESLRAYYHKPLAYESLLFLYTLLT--LILFSVIVEACFNAYRSIKLN 207 

Query: 228 KPSFSVAATTSLFLALVFNYTFQYGVKGDEALLGYYVFPGATLFQIVAITLVALLAYVIT 287 

•^P■^ S■^A -t-SL A -fFNY FQYG•^K D LLG Y+ PGAT +QI+ ■^T Y+Z 
Sbjct: 208 RPNLSLAFVSSLLFATIFNYAFQYGLKNDftDLLGKYIVPGATAYQILVLTAAGFFLYLII 267 

Query: 288 NRYWPTTFFLLILGTIISWNDLKESMRSEPLLVTDFVWLQELGLVTSFVKKSVIVEMW 347 

NRY TF ■^■HLG^fll-l^WN LK MR^fEPLLVTDF W^l^ L^f V ++1 ++ 
Sbjct: 268 NRYLLVTFLIVILGSIITTOmiKVGMRlffiPLLVTDFAWVTNIRLLARSVKaNIIFSTLL 327 

Query: 348 GLAICIWAWYLHGRVLAGKLFMSPVKRASAVLGLFIVSCSMLIPFSYEKEGKILSGLPI 407 

LA 1++ -fL R•^L GK+ ■f ■v ■f S-fr I F EK KI^l-fG^fP^f 

Sbjct: 328 ILAACILLYLFLRKRLLQGKITENYRLKVGLISSICLLGFSIFIIFRNEKGSKIVNGIPV 387 

Query: 408 ISALNNDNDINWLGFSTNARYKSLAYVm'RQVTKKIMEICPTNYSQETIASIAQKYQKLAE 467 

IS ■fNN DI + GF -hNA YKSL YVWT-^QVTK IM-^KP■^■^■YS■^-E I ■vA-fKY -1-A 
Sbjct: 388 ISQVNNVraJIGYQGFYSNASYKSIMYVOTKQVTKSimKPSDYSKERILKLAKKYWl^^ 447 

Query: 468 DINKDRKNNIADQTVIYLLSESLSDPDRVSNVTVSHDVLPNIKAIKNSTTAGLMQSDSYG 527 

INK R NI++QTVIY+LSES SDPDRV V +S DV+ENIK IK TT+GLM SD YG 
Sbjct: 448 KINKVRTENISNQTVIYILSESFSDPDRVQGVNLSRDVIENIKQIKEKTTSGLMHSDGYG 507 


Query: 528 GGTAIWEFQTLTSLPFYOTSSSVSVLySEVPPKMAKPHTISEFYQGKSKIAJfflPASaiilN 587 

GGTANMEFQ+LT LP+YNF+SSVS LY+EV P M+ +IS ++ KNR+ +HP+SA+N+ 
Sbjct: 508 GGTMqMEFQSLTGLPYYNFNSSVSTLYTEVVPDMSVFPSlSNQFKSKNRVVIHPSSASNY 567 

Query: 588 NRKTVYSNLGFSKFLALSGSKDKFKNIENVGLLTSDKTVYNNILSLINPSESQFFSVITM 647 

+RK VY L F F+A SG+ DK + E VGL SDKT Y NIL INPS+SQFFSV+TM 
Sbjct: 568 SRKYVYDKLKFPTFVASSGTSDKITHSEKVGLNVSDKTTYQNILDKINPSQSQFFSVMTM 627 

Query: 648 QJraiPWSSDYPEEIVAEGKNFTEEENHlTLTSYiUlLLSFTDKETRAFLEKLTQIN^ 707 
QNH+PW+SD P ++VA GK +T++EN +L+SYARLL++TDKET+ FL +L+Q+ +TW 

Sbjct: 628 QiraVPWASDEPSrmaTGKGYTKDENGSLSSYJUlLLTYTDKETKDFLaQLSQIiKHK\^^ 687 

Query: 708 FYGDHLPGLYPDSAFNKHIENKYLTDYFIWSNGTNEKKNHPLINSSDFTTyULFEHTDSKV 767 
FYGDHLPGLYP+SAF K +++Y TDYFIWSN + NH +NSSDFTA L EHT+SKV 
Sbjct: 688 FYGDHLPGLYPESAFKKDPDSQYQTDYFIWSNYMTKTLNHSYVNSSDFTAELLEHTNSKV 747 

Query: 768 SPYYAIJijTBVIjNKASVDKSPDSPEVKAIQNroLKNIQyDVTIGKGYLI^ 822 

SPYYALLTEVL+ +V + E K I NDLK IQYD+T+GKGY+ +K FP I 

Sbjct: 748 SPYYALLTEVLDNTTVGHGKLTKEQKEIANDLKLIQYDITVGKGYIRNYKGFFDI 802 


Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2615 

A DNA sequence (GASx789R) was identified in S.pyogenes <SEQ ID 7729> which encodes the amino acid 
sequence <SEQ ID 7730>. Analysis of this protein sequence reveals the following:

Possible site: 13

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -1.06 Transmembrane 42 - 58 ( 42 - 58)

Final Results

bacterial membrane --- Certainty=0.1426 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2616

A DNA sequence (GASx790) was identified in S.pyogenes <SEQ ID 7731> which encodes the amino acid
sequence <SEQ ID 7732>. Analysis of this protein sequence reveals the following:

Possible site: 24

>>> Seems to have a cleavable N-term signal seq.

Final Results

bacterial outside --- Certainty=0.3000 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2617 

A DNA sequence (GASx791) was identified in S.pyogenes <SEQ ID 7733> which encodes the amino acid 
sequence <SEQ ID 7734>. Analysis of this protein sequence reveals the following: 

Possible site: 48 

>>> Seems to have a cleavable N-term signal seq.

INTEGRAL Likelihood =-12.42 Transmembrane 166 - 182 ( 157 - 188)
INTEGRAL Likelihood = -7.32 Transmembrane 85 - 101 ( 79 - 104)
INTEGRAL Likelihood = -6.90 Transmembrane 397 - 413 ( 386 - 417)
INTEGRAL Likelihood = -6.05 Transmembrane 253 - 269 ( 252 - 273)
INTEGRAL Likelihood = -5.26 Transmembrane 301 - 317 ( 293 - 325)
INTEGRAL Likelihood = -3.35 Transmembrane 363 - 379 ( 362 - 379)
INTEGRAL Likelihood = -3.24 Transmembrane 335 - 351 ( 335 - 351)

Final Results

bacterial membrane --- Certainty=0.5967 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAA64645 GB:U10927 CapF [Staphylococcus aureus] 
Identities = 97/419 (23%) , Positives = 186/419 (44%) , Gaps = 40/419 (9%) 

Query: 12 FLWNMLGSLSTAVISVILLMVVTRLLTSADSDIYAFAYSFANMMVVVGLFQVRNyQATDI 71 
F + + ++ +A+ ++L+V+ RL T D Y +A + + ++R+ T 

Sbjct: 5 FNYMFVANILSALCKFLILLVIVRLGTPEDVGRYNYALVITAPIFLFISLKIRSVIVT-- 62 

Query: 72 NEKYSFSQYLVARLMTCLLMLAITVIYLTLTKTDSYKSTIVFLVCFYRSTDAFSDLYQGM 131 
N+KYS ++Y+ A L ++ L I++ + T + +V + + ++ G+ 

Sbjct: 63 NDKYSPNEYISAILSLNIITLIFVAIFVYVLGNGDL--TTILIVSLIKLFE2IIKEVPY6I 120 

Query: 132 FQQHERLDlAGKSLAYRNTLIFMVYTAIILYSKtttiTLALVAVCIVSLVFIMYYDIGHSKK 191 

+Q++E L + G S+ N L +++ I +S NL +AL+ + I + D + K 

Sbjct: 121 YQKlffiSLKLLGISMGIYNILSLILFYIIYSFSHNIiNMaLLFLVISCIFSFAIIDRWyLSK 180 


Query: 192 FQKLMFSELLSNISFQNSLKLLKESF PLFLNGFLIIYIYTOPKYAIELMTTLGEVA 247 

+ + + + N++ KE F PL + L P+ +E + G+ 

Sbjct: 181 YYNI KLHYNNNIAKFKEIFILTIPLAFSSALGSLNTGIPRIVLENL- -PGKYT 231 

Query: 248 LGS-QTIENILFMPAFVMNLLILFFRPHITQMAIALIRGQIK-EFNKIQVQLFAYLGVF- 304 
LG TI +L + N + F P + + L + + K EF K+ ++ ++G+F 
Sbjct: 232 LGIFSTIAYVLVIGGLFANSISQVFLPKLRK LYKDEKKIEFEKLTRKM-VFIGIFI 286 

Query: 305 SLIALVGSGLFGIPFLSILYG TNLTDYWVDF-MLIMLGGSIGSFATVIDNILTAM 358 

+ +++ S G LS+L+G N+ + F +L +L G + 

Sbjct: 287 GMCSVILSLFLGEALLSLLFGKEYGENNIILIILSFGLLPILSGIFLGTTIIATGKXNVN 346 

Query: 359 RKQQLLLIPYTGGPLISLLITNLFVMKYHIIJ3AALSFLIT^tt.VWLGLSIMIYLFIMNRF 417 
K L+L+ P I L+ + L + KY +LGAAL+ 1+ V L I Y F F 

Sbjct: 347 YKISLILL PCI-L1FSFLLIPKYSLLGAALTITISQFVAL---ISYYYFYKRIF 396

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2618 

A DNA sequence (GASx792) was identified in S.pyogenes <SEQ ID 7735> which encodes the amino acid 
sequence <SEQ ID 7736>. Analysis of this protein sequence reveals the following: 

Possible site: 36 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood =-10.03 Transmembrane 64 - 80 ( 60 - 84)
INTEGRAL Likelihood = -9.66 Transmembrane 43 - 59 ( 37 - 63)
INTEGRAL Likelihood = -8.70 Transmembrane 232 - 248 ( 229 - 251)
INTEGRAL Likelihood = -8.28 Transmembrane 410 - 426 ( 402 - 432)
INTEGRAL Likelihood = -6.21 Transmembrane 298 - 314 ( 296 - 322)
INTEGRAL Likelihood = -6.21 Transmembrane 478 - 494 ( 471 - 496)
INTEGRAL Likelihood = -5.04 Transmembrane 265 - 281 ( 256 - 288)
INTEGRAL Likelihood = -3.29 Transmembrane 380 - 396 ( 378 - 397)
INTEGRAL Likelihood = -2.92 Transmembrane 210 - 226 ( 209 - 227)
INTEGRAL Likelihood = -2.60 Transmembrane 187 - 203 ( 187 - 204)
INTEGRAL Likelihood = -2.50 Transmembrane 442 - 458 ( 439 - 458)
INTEGRAL Likelihood = -1.65 Transmembrane 18 - 34 ( 18 - 35)
INTEGRAL Likelihood = -1.38 Transmembrane 165 - 181 ( 165 - 181)

Final Results

bacterial membrane --- Certainty=0.5012 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:BAA19642 GB:AB002668 unnamed protein product [Actinobacillus
actinomycetemcomitans] 
Identities = 116/459 (25%) , Positives = 207/459 (44%) , Gaps = 60/459 (13%) 



Query: 69 FILVFGTISAIISPINDIPDEYVHYSRTVYISEGDINLTNKINKKLRISKDVDKLI 123
FIL F I II+P PDE+ H+ R IS G I ++ K + K + K++
Sbjct: 16 FILTF- IIGVIITPPYQSPDEFYHFQRGYAISNGQIIPSSTEK---LDKAMMKMLSIYE6 71

Query: 124 KQSGKTFITSNLKATKHSTREYSYPYIKGTNAYYSFSYIPQALGILVGNALDLPIL 179
++ T N +EY TN Y+ Y+PQALG +G+ LDL +
Sbjct: 72 IPYRSENKVTHFLENEAQNVAWEKEYILDESANTNVYFPLIYLPQALGSFLGSTLDLSLY 131

Query: 180 LTYYFGRLCN-LISYAMLAFIAIKLSGSFKQVIAWTLLPMNIYLAASFNQDGFAIGLVL 238
YY ++ L+S A+L F +++ S + ++ LPM ++ S N D ++
Sbjct: 132 NMYYI1AKIFTLLVSIAILYFASVQYRLSIP--VLLILSLPMTMFQMGSTNPDS II 184

Query: 239 VTIGLFI-NLLSSKDKSNYNTKFFLYLVLCGLL VLSKFTYFLLVCLPLFIENEK 291
++ +FI +LL+ SNYN F + C LL V KF +L+ LP FI +
Sbjct: 185 FSLSVFIGSLLARGLDSNYN---FTHKDFCKLLPSIFLCVTVKFNMLVLLLLPFFISKRR 241

Query: 292 FGKNTKLVILKKLGGLLLIFLFAAMWFRLYGQVKTPYVADFLKEV NVSQQVKNMLE 347
++ + + ++L+A K++F+ +++KNL
Sbjct: 242 EIRHGSMYSIFIIILSILWIVXJy^KLTEAQSHFKEGALHNFSYYIFHMDDLFEIFKNTLN 301

Query: 348 SPIVYSSIIIRHMVINLINMNNIFQFGA-LSYGITNLFPLYVCFFFFVYISNASKITINI 406
+ Y ++R + L ++ F L +G T+L + F++I N K+ I
Sbjct: 302 --LTYLKSLLRMFLGVLGWVDTKFTINEYLFFGSTSLLA YIFLFIHNLYKLKYVI 354

Query: 407 VEKM--GIIFVISAI1GATVLAMYLTWTPVGSSTVLGVQSRYLIGIIPLVLLLFSS 460
V + G++F+ +1 + +T+ +G++ ++GVQ RY IP++L++FSS
Sbjct: 355 VSVLLVGWFLFTHFI LLITVNEIGTTQIVGVQGRY---FIPIlttllFSSFILK 405

Query: 461 QQQKFKQIEDILSDKIiAIHVSLLFmiMLM--STIFRYy 497 

+ +K +1 + + LFI + + + + RYY 
Sbjct: 406 KSEKTSNNKTISKyPIIVPFIiFLFISSFITINTLVSRYY 444 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2619 

A DNA sequence (GASx797) was identified in S.pyogenes <SEQ ID 7737> which encodes the amino acid 
sequence <SEQ ID 7738>. Analysis of this protein sequence reveals the following: 

Possible site: 49

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1491 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database: 

>GP:J^C83961 GB:L47648 cytidine monophosphate kinase [Bacillus subtilis] 
Identities = 116/220 (52%), Positives = 156/220 (70%), Gaps = 1/220 (0%) 

Query: 2 KAIKIAIDGPASSGKSTVRKIIAKNLGYTYLDTGftMXRSATYIALTHGYTGKEVALILEE 61
K + IAIDGPA++GKSTVAKI+A+ Y Y+DTGAMYRH- TY AL + + E
Sbjct: 3 KKLSmiDGPAAAGKSTVAKIVAEKKSYIYIDTGarraiAITYAaLQENVDLTDEEKLAEL 62

Query: 62 LEKNPIFFKKAKDGSQLVFLGDEDVTLAIRQNDVTNNVSWISALPEIREELVHQQRRIAQ 121
L++ I KDG Q VF+ DVT AIR ++++N VS + +REE+V +Q+++ +
Sbjct: 63 LKRTOIELITTKDG-QKVFVrraTDVTEAIRTDEISNQVSIAAKHRSVREEMVKRQQQLGE 121

Query: 122 AGGIIMDGRDIGTWLPDAELKIFLVASVEERAERRYKENLEKGIESDFETIiKEEIAARD 181
GG++MDGRDIGT VLP+AE+KIFL+ASVEERA+RRY+EN++KG + ++ETL EEIA RD
Sbjct: 122 KGGVVMDGRDIGTHVLE^IAEVKIFLLASVEERAKRRYEENVKKGFDVNYETLIEEIARRD 181

Query: 182 YKDSHRKVSPLKAAEDALIFDTTGVSIDGWQFIQEKAEK 221
DS R+VSPL+ AEDAL DTT +SI V I E E+
Sbjct: 182 KLDSEREVSPLRKAEDALEIDTTSLSIQEVADKILEAVEQ 221

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 



Example 2620 

A DNA sequence (GASx799) was identified in S.pyogenes <SEQ ID 7739> which encodes the amino acid 
sequence <SEQ ID 7740>. Analysis of this protein sequence reveals the following: 

Possible site: 29

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.4324 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 



WO 02/34771 -2730- PCT/GB01/04789



The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAA34313 GB:X16188 ribosomal protein L35 (AA 1-66) [Bacillus 
stearothermophilus] 
Identities = 46/65 (70%), Positives = 51/65 (77%)

Query: 1 MPKQKTHRASAKRFKRTGSGGLKRFRAPTSHRFHGKTKKQRRHLRKAGLVSSGDPKRIKA 60
MPK KTHR SAKRFK+T SG LKR A+TSH F KTKKQ+RHLRKA LVS GDPKRI+
Sbjct: 1 MPKMKIimGSAKRFKKTASGKLKRGHAYTSHLFANKTKKQKRHLRK^ 60

Query: 61 MVTGL 65
M+ L
Sbjct: 61 MLntJL 65

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2621 

A DNA sequence (GASx806R) was identified in S.pyogenes <SEQ ID 7741> which encodes the amino acid
sequence <SEQ ID 7742>. Analysis of this protein sequence reveals the following:

Possible site: 16

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.5361 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 
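
The "Final Results" blocks in these Examples each list a certainty score per predicted compartment. As an illustration only, the following minimal Python sketch shows how such a block could be read to pick the highest-scoring localization. It assumes lines of the form `bacterial <site> --- Certainty=<score> (<call>)`; the helper names `parse_final_results` and `best_localization` are hypothetical and are not part of the original analysis software.

```python
import re

# Assumed line format (hypothetical parser, not the original tool):
#   bacterial cytoplasm --- Certainty=0.5361 (Affirmative) < succ>
LINE_RE = re.compile(
    r"bacterial\s+(?P<site>\w+)\s*[-\u2014]*\s*"
    r"Certainty\s*=\s*(?P<score>\d+\.\d+)\s*\((?P<call>[^)]+)\)"
)


def parse_final_results(text):
    """Return (site, certainty, call) tuples for every matching line."""
    return [
        (m.group("site"), float(m.group("score")), m.group("call").strip())
        for m in LINE_RE.finditer(text)
    ]


def best_localization(text):
    """Return the entry with the highest certainty, or None if none parsed."""
    results = parse_final_results(text)
    return max(results, key=lambda r: r[1]) if results else None


example = """
bacterial cytoplasm --- Certainty=0.5361 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>
"""

print(best_localization(example))
```

When several compartments tie (for instance, all at 0.0000), `max` returns the first one listed, so a tie should be read as "Not Clear" rather than as a prediction.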

Example 2622 

A DNA sequence (GASx809R) was identified in S.pyogenes <SEQ ID 7743> which encodes the amino acid 
sequence <SEQ ID 7744>. Analysis of this protein sequence reveals the following: 

Possible site: 52

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -8.81 Transmembrane 33 - 49 ( 28 - 53)

Final Results

bacterial membrane --- Certainty=0.4524 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2623

A DNA sequence (GASx814R) was identified in S. pyogenes <SEQ ID 7745> which encodes the amino acid 
sequence <SEQ ID 7746>. Analysis of this protein sequence reveals the following: 

Possible site: 33

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.0206 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2624 

A DNA sequence (GASx817) was identified in S.pyogenes <SEQ ID 7747> which encodes the amino acid 
sequence <SEQ ID 7748>. Analysis of this protein sequence reveals the following: 

Possible site: 13

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -1.49 Transmembrane 16 - 32 ( 15 - 32)

Final Results

bacterial membrane --- Certainty=0.1595 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2625 

A DNA sequence (GASx820) was identified in S.pyogenes <SEQ ID 7749> which encodes the amino acid
sequence <SEQ ID 7750>. Analysis of this protein sequence reveals the following:

Possible site: 31

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -7.11 Transmembrane 62 - 78 ( 59 - 81)
INTEGRAL Likelihood = -6.00 Transmembrane 128 - 144 ( 123 - 147)
INTEGRAL Likelihood = -2.50 Transmembrane 5 - 21 ( 3 - 26)

Final Results

bacterial membrane --- Certainty=0.3845 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>



No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAA26653 GB:M83994 prolipoprotein signal peptidase
[Staphylococcus aureus]
Identities = 57/153 (37%), Positives = 96/153 (62%), Gaps = 6/153 (3%)

Query: 1 MKKRLFVLSLILL VRLDQLSKFWIVSHIALGEVKPFIPGIVSLTYLQNNGAAFSIIi 56
M K+ F+ + IL+ V DQ++K+ I + + +G+ IP +++T +NNGAA+ IL
Sbjct: 1 mKKYFIGTSILIAVFWIFDQVTKYIIATTtmGDSFEVIPHFIiNITSHPMl^^ 60

Query: 57 QDQQWFFWITVLVIGYAIYYLATHPHIiNIWKQLALLLIISGGIGNFIDRLRLAYVIDMI 116
+ FF +IT++++ +Y+ N++ Q+A+ L+ +G +GNFIDR+ V+D I
Sbjct: 61 SGKMTFFFIITIIILIJiLVYFPIKimQYNLPMQVAISLLPAGSU^COTIDRILTGEVVDFI 120

Query: 117 HLDF--VDFAIFlTOftDSYLTVGVILLLICLWKE 147
+ DF IFN+ADS LT+GVII:j++I L K+
Sbjct: 121 DTNIFGYDFPIFNIADSSLTIGVILIIIALLKD 153

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2626 

A DNA sequence (GASx822R) was identified in S.pyogenes <SEQ ID 7751> which encodes the amino acid
sequence <SEQ ID 7752>. Analysis of this protein sequence reveals the following:

Possible site: 33

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2638 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2627

A DNA sequence (GASx823R) was identified in S.pyogenes <SEQ ID 7753> which encodes the amino acid
sequence <SEQ ID 7754>. Analysis of this protein sequence reveals the following:

Possible site: 45

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3452 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.






Example 2628 

A DNA sequence (GASx828) was identified in S.pyogenes <SEQ ID 7755> which encodes the amino acid 
sequence <SEQ ID 7756>. Analysis of this protein sequence reveals the following: 

Possible site: 21

>>> Seems to have an uncleavable N-term signal seq

Final Results

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2629 

A DNA sequence (GASx836) was identified in S.pyogenes <SEQ ID 7757> which encodes the amino acid 
sequence <SEQ ID 7758>. Analysis of this protein sequence reveals the following: 

Possible site: 18

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.4333 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2630 

A DNA sequence (GASx853R) was identified in S.pyogenes <SEQ ID 7759> which encodes the amino acid
sequence <SEQ ID 7760>. Analysis of this protein sequence reveals the following:

Possible site: 14

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.4906 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2631 

A DNA sequence (GASx854R) was identified in S.pyogenes <SEQ ID 7761> which encodes the amino acid
sequence <SEQ ID 7762>. Analysis of this protein sequence reveals the following:

Possible site: 43

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3989 (Affirmative) < succ>
bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>
bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

A related sequence was also identified in GAS <SEQ ID 9147> which encodes the amino acid sequence
<SEQ ID 9148>. Analysis of this protein sequence reveals the following:

Possible site: 42

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.399 (Affirmative) < succ>
bacterial membrane --- Certainty=0.000 (Not Clear) < succ>
bacterial outside --- Certainty=0.000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAB59092 GB:M97a57 pyrogenic exotoxin C [Streptococcus pyogenes] 
Identities = 39/67 (58%), Positives = 53/67 (78%)

Query: 1 LMESKEIYLTKSPYIRGSLEIHSKNRKHEKINLYDAKPNSTRSDVFKKYKDNKTINMKDF 60
LM++ +IY SPY+ G +EI +K+ KHE+I+L+D+ TRSD+F KYKDN+ INMK+F
Sbjct: 167 LMDNYKIYDATSPYVSGRIEIGTKDGKHEQIDLFDSPNEGTRSDIFAKYKDNRIINMKNF 226

Query: 61 SHFDIYL 67
SHFDIYL
Sbjct: 227 SHFDIYL 233

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2632 

A DNA sequence (GASx855R) was identified in S.pyogenes <SEQ ID 7763> which encodes the amino acid 
sequence <SEQ ID 7764>. Analysis of this protein sequence reveals the following: 

Possible site: 33

>>> Seems to have a cleavable N-term signal seq.

Final Results

bacterial outside --- Certainty=0.3000 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2633 

A DNA sequence (GASx856) was identified in S.pyogenes <SEQ ID 7765> which encodes the amino acid
sequence <SEQ ID 7766>. Analysis of this protein sequence reveals the following:

Possible site: 26

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.4145 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2634

A DNA sequence (GASx862) was identified in S.pyogenes <SEQ ID 7767> which encodes the amino acid
sequence <SEQ ID 7768>. Analysis of this protein sequence reveals the following:

Possible site: 19

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.6285 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2635 

A DNA sequence (GASx863) was identified in S.pyogenes <SEQ ID 7769> which encodes the amino acid
sequence <SEQ ID 7770>. Analysis of this protein sequence reveals the following:

Possible site: 51

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2636

A DNA sequence (GASx878) was identified in S.pyogenes <SEQ ID 7771> which encodes the amino acid
sequence <SEQ ID 7772>. Analysis of this protein sequence reveals the following:

Possible site: 21

>>> Seems to have a cleavable N-term signal seq.

Final Results

bacterial outside --- Certainty=0.3000 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2637 

A DNA sequence (GASx887R) was identified in S.pyogenes <SEQ ID 7773> which encodes the amino acid
sequence <SEQ ID 7774>. Analysis of this protein sequence reveals the following:

Possible site: 20

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1911 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2638 

A DNA sequence (GASx910) was identified in S.pyogenes <SEQ ID 7775> which encodes the amino acid 
sequence <SEQ ID 7776>. Analysis of this protein sequence reveals the following: 

Possible site: 20

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.4511 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2639 

A DNA sequence (GASx911) was identified in S.pyogenes <SEQ ID 7777> which encodes the amino acid
sequence <SEQ ID 7778>. Analysis of this protein sequence reveals the following:

Possible site: 52

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2993 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAC74707 GB:AE000259 glutathionine S-transferase [Escherichia coli]
Identities = 29/137 (21%), Positives = 61/137 (44%), Gaps = 9/137 (6%)

Query: 1 LPFIAKQTLKSQLIPQDNLLAESRFNEIMDFLTGDFPLVFRPMINPHRYTISQDNQALEK 60
+ ++A QL+ N ++ + E ++++ + F P+ P E+
Sbjct: 70 MQYLADSVPDRQLLAPVNSISRYKTIEWIiNYIATELHKGFTPLFRP-- --DTPEE 120

Query: 61 VKQASYKRMDIAMTHLDSLIGESGHVYRDQQTIADAYAYAMALWSQKTPKSYENYPHLAA 120
K +++ + +++ + + + + TIADAY + + W+ + E H+AA
Sbjct: 121 YKPTVRAQLEKKI^SYVNEALKDEHWICGQRFTIAmYLFTVLRWAYAVKI^ 180

Query: 121 FMAKMVEDSAVQQVLNA 137
FM +M E VQ L+A
Sbjct: 181 FMQRMAERPEVQDALSA 197

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2640 

A DNA sequence (GASx932R) was identified in S.pyogenes <SEQ ID 7779> which encodes the amino acid
sequence <SEQ ID 7780>. Analysis of this protein sequence reveals the following:

Possible site: 14

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.4081 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2641 

A DNA sequence (GASx935) was identified in S.pyogenes <SEQ ID 7781> which encodes the amino acid
sequence <SEQ ID 7782>. Analysis of this protein sequence reveals the following:

Possible site: 45

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.6304 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2642

A DNA sequence (GASx937) was identified in S.pyogenes <SEQ ID 7783> which encodes the amino acid
sequence <SEQ ID 7784>. Analysis of this protein sequence reveals the following:

Possible site: 34

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3503 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2643 

A DNA sequence (GASx938R) was identified in S.pyogenes <SEQ ID 7785> which encodes the amino acid 
sequence <SEQ ID 7786>. Analysis of this protein sequence reveals the following: 

Possible site: 27

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2884 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2644

A DNA sequence (GASx939) was identified in S.pyogenes <SEQ ID 7787> which encodes the amino acid
sequence <SEQ ID 7788>. Analysis of this protein sequence reveals the following:

Possible site: 50

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2771 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2645 

A DNA sequence (GASx941) was identified in S.pyogenes <SEQ ID 7789> which encodes the amino acid 
sequence <SEQ ID 7790>. Analysis of this protein sequence reveals the following: 

Possible site: 29

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2257 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2646 

A DNA sequence (GASx942R) was identified in S.pyogenes <SEQ ID 7791> which encodes the amino acid 
sequence <SEQ ID 7792>. Analysis of this protein sequence reveals the following: 

Possible site: 23

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3255 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAB91582 GB:AF242881 ymh. [Agrobacterium tumefaciens] (ver 2) 
Identities = 75/223 (33%), Positives = 116/223 (51%), Gaps = 2/223 (0%)

Query: 38 DQNSGENKHKRVHNLVSDILNRTQNTnNIKLVIEYVCNPLRYIlJEVSIEEQIiRTAINIPL 97
D + K R++N + N + +1 I P R+ + FE +R +N L
Sbjct: 39 DTDPQMTKRHRLraAFASDQNSRKQRTHIIAFIRKAMKPERPARDSERFEPMRIilS^^ 98

Query: 98 SLKGLIVSDSGQIVTTTTSKTLSEAKKRFETLDSRLKELKVHPHVLKFCTQELLQENYFH 157
+ Gil V SG++ ++TLS+A +R L + L VHP VL+FC +EIjIi +NYFH
Sbjct: 99 AFAGLAVKASGELBAVnftAETIiSOaTRRALELRADLTSRGVHPDVLRFCREELIiVDljyFH 158

Query: 158 AVFEASKSVFHRIRLLTGSAMDSASLIDQCFKPGEPIVI INGNKLQTLDEQSEYKGLKNL 217
AV EA K V +IR TG DA L+D+ F P++ I N+LQ+ E+ E +G NL
Sbjct: 159 AVLEAVKSVADKIRQRTGLTDDGAVLVDRAFSGDAPMLAI- -NELQSESEKGEQRGFSNL 216

Query: 218 LLAIAHLYRNSKAHKLKYYNPDNLiroALTALTLMSLAHNLIJDS 260
+ ++RN+ AH + + + DA ++ SL H +D+
Sbjct: 217 VKGTFSMFRNTTAHAPRIHWQMSKEDAEDLFSMFSLMHRRIDA 259

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.



Example 2647 

A DNA sequence (GASx943R) was identified in S.pyogenes <SEQ ID 7793> which encodes the amino acid 
sequence <SEQ ID 7794>. Analysis of this protein sequence reveals the following: 

Possible site: 30

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1526 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2648 

A DNA sequence (GASx944) was identified in S.pyogenes <SEQ ID 7795> which encodes the amino acid
sequence <SEQ ID 7796>. Analysis of this protein sequence reveals the following:

Possible site: 19

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.1427 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2649

A DNA sequence (GASx945) was identified in S.pyogenes <SEQ ID 7797> which encodes the amino acid
sequence <SEQ ID 7798>. Analysis of this protein sequence reveals the following:

Possible site: 13

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2578 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAC98430 GB:L29324 excisionase [Streptococcus pneumoniae]
Identities = 23/54 (42%), Positives = 40/54 (73%)

Query: 1 LIQQWEGLTVATIUCQWATEMRDHPDFKQFVIJSPTHRIVFIDYEGFKLPVQWKSR 54
++++W+GL T +W EMR++ F +V+NPTH++VFI+ EGF+ F++WK +
Sbjct: 21 ILKRTOGIJIKYmNRWIKEMRENRTFSiynirVINPTHKLVFIN^^ 74

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2650 

A DNA sequence (GASx946) was identified in S.pyogenes <SEQ ID 7799> which encodes the amino acid
sequence <SEQ ID 7800>. Analysis of this protein sequence reveals the following:

Possible site: 16

>>> Seems to have an uncleavable N-term signal seq

INTEGRAL Likelihood = -4.99 Transmembrane 3 - 19 ( 1 - 23)

Final Results

bacterial membrane --- Certainty=0.2996 (Affirmative) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2651

A DNA sequence (GASx950) was identified in S.pyogenes <SEQ ID 7801> which encodes the amino acid
sequence <SEQ ID 7802>. Analysis of this protein sequence reveals the following:

Possible site: 51

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2211 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2652 

A DNA sequence (GASx951) was identified in S.pyogenes <SEQ ID 7803> which encodes the amino acid
sequence <SEQ ID 7804>. Analysis of this protein sequence reveals the following:

Possible site: 30

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.4258 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2653

A DNA sequence (GASx952) was identified in S.pyogenes <SEQ ID 7805> which encodes the amino acid
sequence <SEQ ID 7806>. Analysis of this protein sequence reveals the following:

Possible site: 46

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.2476 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAF74110 GB:AF212847 ORF245 [Lactococcus lactis bacteriophage
ul36.2]

Identities = 82/265 (30%), Positives = 128/265 (47%), Gaps = 27/265 (10%)

Query: 1 MaNQLSTQQVKRDITTDPTLLTGADIKKJfFDPQlinJLSEKQVGQALALCKGRm 60
MAN+L V L IK+Y D S+ ++ + LCK N+NPF EV
Sbjct: 1 MaNELGIFSVDN LtMTIKQVIiDGGGKASmELVIiLINLCKQNNMNPFMKEV 52

Query: 61 YITOYKIMSGTDFSLIVSKEAFMKRAERCEGTOGFEAGITVM-RNGEMVEIEGSLKLPDD 119
Y + Y N ++VS++ + KRA + + G E G+ V+ ++G + EG+ K +
Sbjct: 53 YFIKYGNQPA QIWSRDFYRKRAFQNPNFVGIEVGVIVLNKDGVLEHNEGTFKTHEQ 109

Query: 120 VLIGGWAIVYRKDRSHRYKVIVDEtraYA/KLDKyGNPRSTWKSMPGTMIRKTALVQTLR^ 179
L+G Wft. V+ K+ V V ++EYV++ K G+P W + P TM+ K A Q LR A
Sbjct: 110 ELVGAWARVHLKinEIPVYVAVSYDEYVQM-KDGHPNKMWTNKPCTMLGKVAES^ 168

Query: 180 FPDELGNMYTDIDGGDTFDAIKDVTPQETQEEVRARK MAQIEQYKQEQ- -TQKQTQK 234
FP E y + + + P++ EV K AQIE + +E +K +
Sbjct: 169 FPAEFSGTYGEEEYPE PEKEPREVNGVKEPDRAQIESFDKEDYAAKKIEEL 219

Query: 235 ADTSYPVDEVSEHTDDPVQGELIjDG 259
++P EVET++ E L+G
Sbjct: 220 KEKAQPQKEWEETGEVTDEEPLEG 244

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2654 

A DNA sequence (GASx953) was identified in S.pyogenes <SEQ ID 7807> which encodes the amino acid 
sequence <SEQ ID 7808>. Analysis of this protein sequence reveals the following: 

Possible site: 13

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm --- Certainty=0.3413 (Affirmative) < succ>

bacterial membrane --- Certainty=0.0000 (Not Clear) < succ>

bacterial outside --- Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAF74111 GB:AF212847 ORF364 [Lactococcus lactis bacteriophage
ul36.2]

Identities = 67/222 (30%), Positives = 120/222 (53%), Gaps = 3/222 (1%)

Query: 1 MQELQLKVTQAQVEIIDREKFEQNINEVVAKYQNYAVTAGTIKDDKQVLADLRKLKKQriS 60
++++++ A + I++ EKF+ +IN+VVA+Y + + + D++ A L KL ++
Sbjct: 19 VKDIEIDFKPAIINILEEEKFKASINQVVAEYTGHVPSVENLTVDRKTRASiareLITKIE 78

Query: 61 DERIKVKKELSKPADDIDGYIKQASKPLDDTIDKIATDVKEFEDHQKRLRLDTVKSYLSN 120
R ++KK ++ P + +G+ K+A P++ I+ I +K+ E QK R V L
Sbjct: 79 TRRKEIKKSINVPYAEFEGWYKKAIAPMEKVIETIDAGIKKIEAEQKESRKKVVHELLVE 138

Query: 121 KASEYMLDPRIFDEKAMEYTKAGNFMADGVTLKKVTMKSLEDLVTFEYQKEQEVEKAKAT 180
++ +D RIF+ ++ K+ NF + + KK + S+ ++ E QK E + AK +
Sbjct: 139 LTTDTEVDSRIFENFVDDWAKSSNF--NDIKPKKQLIDSITYVIDGEKQKIAEYKSAKQS 196

Query: 181 ISGQCSffiYGMTDQPYIRMLKE-MTLVEVLGQIKADYLAERQK 221
IS C +T PYIRML T+ E++ I D L EKQ+
Sbjct: 197 ISDFCFQqNITSTPYIRMLDSGKTVSEIMAVITEDVLFEKQR 238

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2655

A DNA sequence (GASx954) was identified in S.pyogenes <SEQ ID 7809> which encodes the amino acid 
sequence <SEQ ID 7810>. Analysis of this protein sequence reveals the following: 

Possible site: 56 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.3884 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 
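
The "Final Results" blocks in these examples all follow one pattern: a location, a certainty value, and a label in parentheses. As a rough illustration only, the Python sketch below reads such a block and picks the location with the highest certainty. The function name `parse_final_results` and the tolerant regular expression are assumptions made for this sketch; they are not the software used to generate the predictions.

```python
import re

# Hypothetical parser for illustration; the pattern tolerates extra spaces
# around "=" and "." as well as dashes between the location and "Certainty".
RESULT_RE = re.compile(
    r"bacterial\s+(\w+)\W+Certainty\s*=\s*(\d)\s*\.\s*(\d+)\s*\(([^)]*)\)"
)


def parse_final_results(text):
    """Return a list of (location, certainty, label) tuples from a Final Results block."""
    results = []
    for location, whole, fraction, label in RESULT_RE.findall(text):
        certainty = float(whole + "." + fraction)
        results.append((location, certainty, label.strip()))
    return results


# Values copied from Example 2655 above.
block = """
bacterial cytoplasm Certainty=0.3884 (Affirmative) < succ>
bacterial membrane Certainty=0.0000 (Not Clear) < succ>
bacterial outside Certainty=0.0000 (Not Clear) < succ>
"""

results = parse_final_results(block)
# The predicted localization is taken here to be the entry with the largest certainty.
best = max(results, key=lambda entry: entry[1])
```

Selecting the maximum is a simplification: when every certainty is 0.0000, as in some examples, no location is clearly favored and the result should be read as undetermined.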

Example 2656 

A DNA sequence (GASx955) was identified in S.pyogenes <SEQ ID 7811> which encodes the amino acid 
sequence <SEQ ID 7812>. Analysis of this protein sequence reveals the following: 

Possible site: 34 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.1777 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2657 

A DNA sequence (GASx956) was identified in S.pyogenes <SEQ ID 7813> which encodes the amino acid 
sequence <SEQ ID 7814>. Analysis of this protein sequence reveals the following: 

Possible site: 16 

>>> Seems to have no N-terminal signal sequence

INTEGRAL Likelihood = -2.44 Transmembrane 82 - 98 ( 81 - 98)

Final Results

bacterial membrane Certainty=0.1977 (Affirmative) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database.

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2658 

A DNA sequence (GASx958) was identified in S.pyogenes <SEQ ID 7815> which encodes the amino acid
sequence <SEQ ID 7816>. Analysis of this protein sequence reveals the following:

Possible site: 34

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.3673 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2659 

A DNA sequence (GASx960) was identified in S.pyogenes <SEQ ID 7817> which encodes the amino acid
sequence <SEQ ID 7818>. Analysis of this protein sequence reveals the following:

Possible site: 36

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.1852 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2660

A DNA sequence (GASx961) was identified in S.pyogenes <SEQ ID 7819> which encodes the amino acid 
sequence <SEQ ID 7820>. Analysis of this protein sequence reveals the following: 

Possible site: 45 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.7380 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:AAF63071 GB:AF158600 gp137 [Streptococcus thermophilus
bacteriophage Sfi11]
Identities = 67/136 (49%), Positives = 97/136 (71%), Gaps = 2/136 (1%)

Query: 5 PEIDIQKTKSNAKRKLREYPRWRRIAITOVDTQKVTATYSFEPRQPHGTPSKPVERL^^ 64
PEID + T KRKLREYPRWR IA+D QK+T ++F PR G +KPVE +A+ R
Sbjct: 4 PEIDEKATLKRCKRKLREYPRWREIAHDSAEQKITQEFTFMPRG--GGVNKPVENIAVRR 61

Query: 65 VSAEQKLDTIERAVNGIFDPEYRLILIDKYLLTYPKTDCBIYTKLGYEKSQYyNMLDNAL 124
V A EL+ IE+AVNG++ P+YR ILI+KYL PK + I +G+E++ + +L+N++
Sbjct: 62 VDAIJTOLEAlEQAVNGLYRPDYRRILIEKYiaYPPKPlWQIAQSIGFERTAFQELLKK 121

Query: 125 LSFSELYKEGMLLVEK 140
L+F+ELY++G L+VE+
Sbjct: 122 LAFAELYRDGRLIVER 137

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2661 

A DNA sequence (GASx962) was identified in S.pyogenes <SEQ ID 7821> which encodes the amino acid 
sequence <SEQ ID 7822>. Analysis of this protein sequence reveals the following: 

Possible site: 16 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.3375 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2662 

A DNA sequence (GASx963R) was identified in S.pyogenes <SEQ ID 7823> which encodes the amino acid 
sequence <SEQ ID 7824>. Analysis of this protein sequence reveals the following: 

Possible site: 48 

>>> Seems to have an uncleavable N-term signal seq

Final Results

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2663

A DNA sequence (GASx964) was identified in S.pyogenes <SEQ ID 7825> which encodes the amino acid 
sequence <SEQ ID 7826>. Analysis of this protein sequence reveals the following: 

Possible site: 51 

>>> Seems to have a cleavable N-term signal seq.

INTEGRAL Likelihood = -6.16 Transmembrane 90 - 106 ( 89 - 111)
INTEGRAL Likelihood = -5.52 Transmembrane 131 - 147 ( 129 - 150)
INTEGRAL Likelihood = -0.43 Transmembrane 53 - 69 ( 52 - 69)

Final Results

bacterial membrane Certainty=0.3463 (Affirmative) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

bacterial cytoplasm Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has no significant homology with any sequences in the GENPEPT database. 

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2664

A DNA sequence (GASx965) was identified in S.pyogenes <SEQ ID 7827> which encodes the amino acid 
sequence <SEQ ID 7828>. Analysis of this protein sequence reveals the following: 

Possible site: 15

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.3944 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae.

The protein has homology with the following sequences in the GENPEPT database:

>GP:CAAS6779 GB:X98106 Rorf172 [Bacteriophage phig1e]
Identities = 36/82 (43%), Positives = 52/82 (62%), Gaps = 3/82 (3%)

Query: 18 ELTEKQQRFVDKYITTEISaTESAKQAGYSEKSAySQGQRLLKNVEIQKaMKERF^ 77
+LT KQQ+F D+YI + NR ++A++AGYS++SA S GQ L +I++ + ER +
Sbjct: 4 KLTPKQQKFADEYIKSGNAADAftRKftGySKRSARSVGQENLTKPDIKQYIDERM---DEI 60

Query: 78 KGDRIQDVAETLEQDTSIARGE 99
RI D E +E T IARGE
Sbjct: 61 ASKRIMDATEAVELLTRIARGE 82

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics. 

Example 2665 

A DNA sequence (GASx966) was identified in S.pyogenes <SEQ ID 7829> which encodes the amino acid 
sequence <SEQ ID 7830>. Analysis of this protein sequence reveals the following:

Possible site: 36

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.2389 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>



No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:CAB13115 GB:Z99110 PBSX defective prophage terminase (large
subunit) [Bacillus subtilis]
Identities = 117/417 (28%), Positives = 195/417 (46%), Gaps = 33/417 (7%)

Query: 31 YRWKGSRGSKKSKTTAIOTIVRLLKypWANLLVIRRYSKnJKQSTYTDFKOTACNQLKOT 90
Y+ + G GS KS TAL +++LLK LVIR +T++ ST+ F+ +Ii +T
Sbjct: 21 YQFLVGGYGSSKSyHTJUIjKIVIiKLLKEK-RTftLVIREVFDTHRDSTFALFQEVIEELGLT 79

Query: 91 HLFKFIffiSLPEITVKATGQKILPRGLDDELKITSITVDVGALCWAWFEEAYQIETEDKFS 150
S ++ G +I+F+G+D+ K+ S V + W EE +++ E
Sbjct: 80 KAVASLSSPLQLRFH-NGSRIMFKGMDNPAKLKS VHNISLIWIEECSEVKYEG- - - 131

Query: 151 TWESIRGSLDAPDFFKQITVTPNPWSERHWLKRVFFDEETKR 193
+ + G L P+ + T NP +W R FF +E K+
Sbjct: 132 --FKELIGRLRHPELKLHMICTTNPVGTSNmYRHFFRDERKKRFVIiDDSELYEKRTIVK 189

Query: 194 ADTFSGTTTFRVNEWLDDVDKRRYEDLYKTNPRRARIVCDGEWGVAEGLVFDNFEWDFD 253
DT+ +T N +L + ++ + L + +P RI G +GV V FEV+ D
Sbjct: 190 GDTYYHHSTAOTmFLPESYVKQLDGLKEYDPDLYRIARKGRFGVNGlRVLPQFEVLPHD 249

Query: 254 -VEKTIQRVKET--SAGMDFGFTQDPTTLICVAVDLANKELWLYjNEHYQKAMLTDHIVKM 310
V+K I + + GMDFGF + ++ +AVD K L++Y E+YQ M D +
Sbjct: 250 QVKKCIAAISKPIFRTGMDFGFEESYNAVVRIAVDPEKKYLYIYWEYYQNKMTDDRTA^^ 309

Query: 311 IRDKNLHRSYIAGDSAEKRLIAEIKSKGVSGIVPSIKGKGSIMQGIQFMQGP-KIYIHPS 369
+R+ + I DSAE +I + +G +V + K GS +Q + ++ F KI+
Sbjct: 310 LREFIETQELIKADSAEPKSIQYFRQQGFR-MVGARKFPGSRLQYTKKVKRFKKIPCSDR 368

Query: 370 CEHTIEEFimTFKQDKEGNWLNEPIDKNNHVIDAIRYAIiEiCmiRSJffiSNQFEVL^ 426
CE+ IE T T+ +DK G + + + H + AI YAL+ Y + + + +R
Sbjct: 369 <33SWIYELETLTYAKDKNGALIEDEFTIDPHTLSAIWYALDDYEVADMKETAHKRMR 425



Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2666 

A DNA sequence (GASx967) was identified in S.pyogenes <SEQ ID 7831> which encodes the amino acid
sequence <SEQ ID 7832>. Analysis of this protein sequence reveals the following: 

Possible site: 32 



>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.4899 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 



>GP:AAC34397 GB:AF158600 gp502 [Streptococcus thermophilus
bacteriophage Sfi11]
Identities = 67/114 (58%), Positives = 83/114 (72%)

Query: 6 FRDSTGKTKTLEFRFHREftRMRYQAESLESLLTEKYKLLREMIEHHDKVQKPRIQELLDY 65
F DSTG+ L RFHRE+R+RY+A++LE L+ ++LL+ I HH Q PRIQELLDY
Sbjct: 7 FTDSTGQDLVLNLRFHRESRIRYRMliajEElJIVNNWELLKNFII^ 66

Query: 66 AEGNNHTISEIGRRKDDDMADVEAVHNYGKYISTLKQGYLVGNPIRVEYIDGTE 119
A G NH + + GRRKD++MAD RAVHNYG+ IS K GYL GNPIRVEY D +
Sbjct: 67 ARGENHDVLKSGRRKDNEMADKRAVHNYGRMISKFKTGYLAGNPIRVEYDDNED 120

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful 
antigens for vaccines or diagnostics. 

Example 2667 

A DNA sequence (GASx968) was identified in S.pyogenes <SEQ ID 7833> which encodes the amino acid 
sequence <SEQ ID 7834>. Analysis of this protein sequence reveals the following: 

Possible site: 34 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.4007 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAC34397 GB:AF158600 gp502 [Streptococcus thermophilus
bacteriophage Sfi11]
Identities = 172/319 (53%), Positives = 227/319 (70%), Gaps = 9/319 (2%)

Query: 1 LIYRS^roDKTEVVRIlDPREVFVIYGNNLEQSSLAGVRYYNKNQLIX3TTKIVEIlYTD]ra 60
+IYRS D+T + RL P E FVIY N+LE +S+A VRYYN+ L +VE+YT+ I
Sbjct: 157 VIYRSEYDETRIKRLSPLETFVIYDNSLEDNSIAAVRYYNRGTLQ^laKDVVEIYTNQHIy 216

Query: 61 KFEYDGDLTPIGETSSHAFGSVPITEYLNTDDGMGDYETELSLIDDYDAAQSDTANYMQD 120
+ IT HAFG+VPITE+LN DG+GDYETEL LIDLYD+A+SDTAN+M D
Sbjct: 217 TLDASDSFNEISVTP-HAFGTVPITEFLNNADGIGDYETELYLIDLYDSAESDTANHMSD 275

Query: 121 LSnAIIAIIGRVSFPGYVDTAEKAIEYLRKMRKAELLNIiEPPVDQDGREGSVDAKYLyKQ 180
++DAILAI G ++ P + ++ M++ RL+ L+PP DG+EG+V A+YL K
Sbjct: 276 MADAILAIYGDIALPOGMQASD MKRTRLMQLKPPKSftDGKEGTVKAEYLTKS 327

Query: 181 YDVQGTEAYKNRIVSDIHKFTNTPDMTDSKFAGQQSGEftLKWKVFGLDQERVDMQALFEQ 240
YDV G EAYK R+ DIH FTNTPDM+D+ F+G SGEALK+K+FGLDQ+RVD Q+ F Q
Sbjct: 328 YDVSGAEAYKTRXJTOIHVFmrPDMSnNHFSGNASGEAIiKYKLFGLDQDRVDTQSQFTQ 387

Query: 241 SLKRRYKLIARVSQLLKEIDDFDISKLKITFTENLPKSLQEKIEAFKALGGELSQETAMA 300
LKRRY+L AR+ L+ E DFD S+LKITFTPNLPKSL E++ LGG++SQETA++
Sbjct: 388 GLKRRYRLiUUlIGSLVITOFKDFDESRLiaTFTPNLPKSLYEQVSlIiNDLGGQVSQETALS 447

Query: 301 ITDIVEDAKKEISLINSES 319
++ +VE+ +E+ IN ES
Sbjct: 448 LSGLVENPTEELDKINEES 466

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2668

A DNA sequence (GASx969) was identified in S.pyogenes <SEQ ID 7835> which encodes the amino acid 
sequence <SEQ ID 7836>. Analysis of this protein sequence reveals the following:

Possible site: 21

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.5307 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>

No corresponding DNA sequence was identified in S.agalactiae. 

The protein has homology with the following sequences in the GENPEPT database: 

>GP:AAC79543 GB:D88974 ORF28 [Streptococcus thermophilus temperate
bacteriophage O1205]
Identities = 118/309 (38%), Positives = 183/309 (59%), Gaps = 18/309 (5%)

Query: 8 YWRDRIKKEMDAK-EADDISLEQSMKQLHDYHFRNIEKEIESFYQRYADKEKIDLSEARK 65
YW R +E +A + + ++ ++ L++ + KE++++ Q+YA+K + +S+A++
Sbjct: 3 YWSKRTLREREASIKKGEAEFiCKELEALYHLQLSQIiRKEIiDAyiQKYANKNGLSVSDAro 62

Query: 67 RASELDISayQKmKEI.VAKaEKIlRREGKIVTRDDFTHQE^MMSIYOTlAMKT^M 126
+A D+ A++ KAK VR DP+ + N ++ YN +M ELL
Sbjct: 63 KADSEDVKAFETKAKRYVADK DPSPKANRELQDYNFSMSVGRQELLI 109

Query: 127 LNIDLEMQELANGEHKLTKKFLDEGYRKETEFQAGLLGLSVASQASVKSLRDAVINANFK 186
++LE+ L+ E +LT +L GY+ E + LL +V S +++ A +NANF+
Sbjct: 110 QELELELLALSESERQLmjyLTNGYKSEV-WESLLDQTVPSGKTLEKYMKAAVNANPE 168

Query: 187 GAKWSDNIWDRQDKLRSIlSQSVQSAILKGKNGLTIARDIRREPDVSASyAKRLAITEHA 246
GA+WS+ IW RQ++LR I+ V A+++G+NGLTIAR IR+ D S + A+RLAITEHA
Sbjct: 169 GREWSERIWKRQEQLRKIVKTEVTRALIRGENGLTIARRIRKHMDASRTEAERLAITEHA 228

Query: 247 RVQMEVGRLSMAENGFAMFDILPEPKACDVCKDIAKH---GPYHLDKWRIGENSPPFHPY 303
RVQ M ENGF F ++PE +ACD+CKDI K P + IG N+PP HPY
Sbjct: 229 RVQTLAQESIMKENGFEHFKLMPESRACDlCKDIGKETEKNPVKIADMEIGTKAPPIHPy 288

Query: 304 CRCAIVGVD 312
CRCA+V V+
Sbjct: 289 CRCAWEVE 297

Based on this analysis, it was predicted that this GAS-specific protein and its epitopes, could be useful
antigens for vaccines or diagnostics.

Example 2669

A DNA sequence (GASx970) was identified in S.pyogenes <SEQ ID 7837> which encodes the amino acid 
sequence <SEQ ID 7838>. Analysis of this protein sequence reveals the following: 

Possible site: 15 

>>> Seems to have no N-terminal signal sequence

Final Results

bacterial cytoplasm Certainty=0.2091 (Affirmative) < succ>

bacterial membrane Certainty=0.0000 (Not Clear) < succ>

bacterial outside Certainty=0.0000 (Not Clear) < succ>



No corresponding DNA sequence was identified in S.agalactiae. 



